Reliability analysis of replicated and-or graphs

De-Ron Liang^1, Rong-Hong Jan^2, Satish K. Tripathi^3

^1 Institute of Information Science, Academia Sinica, Taipei, Taiwan, Republic of China

^2 Department of Information and Computer Science, National Chiao Tung University, Hsinchu 30050, Taiwan, Republic of China

^3 Department of Computer Science, University of Maryland, College Park, Maryland 20742

Received 9 March 1994; accepted 6 July 1995

Abstract: A computation task running in distributed systems can be represented as a directed graph $H(V, E)$ whose vertices and edges may fail with known probabilities. In this paper, we introduce a reliability measure, called the distributed task reliability, to model the reliability of such computation tasks. The distributed task reliability is defined as the probability that the task can be successfully executed. Due to the and-fork/and-join constraint, the traditional network reliability problem is a special case of the distributed task reliability problem, where the former is known to be NP-hard in general graphs. For two-terminal and-or series-parallel (AOSP) graphs, the distributed task reliability can be computed in polynomial time. We consider a graph $H^k(\hat V, \hat E)$, named a k-replicated and-or series-parallel (RAOSP) graph, which is obtained from an AOSP graph $H(V, E)$ by adding $(k - 1)$ replications to each vertex and adding proper edges between two vertices. It can be shown that the RAOSP graphs are not AOSP graphs; thus, the existing polynomial algorithm does not apply. Previously, only exponential time algorithms as used in general graphs are known for computing the reliability of $H^k(\hat V, \hat E)$. In this paper, we present a linear time algorithm with $O(K(|V| + |E|))$ complexity to evaluate the reliability of the graph $H^k(\hat V, \hat E)$, where $K = \max\{k^2 2^{2k}, 2^{3k}\}$. © 1997 John Wiley & Sons, Inc. Networks 29: 195–203, 1997

Correspondence to: D.-R. Liang

1. INTRODUCTION

In the past decade, distributed processing systems have become increasingly popular because they provide a potential increase in reliability, throughput, fault tolerance, and resource utilization. Usually, the computation task of a distributed processing system can be partitioned into a set of software modules (or simply, modules) and then modeled as a directed graph $H(V, E)$, called a task graph. In the task graph $H(V, E)$, $V$ is the set of vertices which represents modules and $E$ is the set of edges which represents the message passing links between two modules.

To increase the survival rate of the task, a straightforward method is to replicate the complete task several times and to execute it independently on distinct computers. The primary site approach [2] is one such example. The disadvantage of this approach is that the system cannot tolerate more than one fault in each replicated task. Recently, the replication of software modules was proposed and implemented, such as in Maruti [8] and Delta-4 [10]. The idea behind this approach can be illustrated in the following example: Consider a simple application modeled by an and-fork/and-join graph as shown in Figure 1(a). (By convention, this task operates only if all the modules as well as the links operate.) Suppose that the application is implemented with an extra replication. In this approach, each module receives messages not only from its predecessors in the same replica, but also from the corresponding predecessors in the other replica. Figure 1(b) shows one such implementation. Thus, a task finishes successfully only if there is a set of modules which forms this application, and their associated communication links are operational. Obviously, this application may tolerate more than one fault in each task replication, depending on the fault patterns. Figure 2 shows a few examples where the task in Figure 1(b) is operational.

Fig. 1. (a) A fork-join task graph; (b) its replicated graph.

The modules and the communication links may fail due to two main factors: software failure and hardware failure. The software failures are mainly due to the design faults or implementation faults. The reliability can be increased when the software components are replicated with the N-version programming approach [5]. The hardware failures are due to either transient failures or permanent failures [10]. It has been reported that most of the hardware failures in computer systems are transient failures [8]. The random events of failures in modules or communication links can be considered as independent, provided that the software components are replicated with the N-version programming technique and the hardware failures are assumed to be transient.

Suppose that the modules and the communication links have a certain probability of being operational. Then, there is a certain probability, called the distributed task reliability, associated with the event that a task completes successfully. This measure accurately models the reliability of a task running in distributed systems. Due to the and-fork/and-join constraint of the task graph, the traditional network reliability problem [1, 3, 4, 6, 7, 11, 12] is a special case of the distributed task reliability problem, where the former is known to be NP-hard for general graphs. For the two-terminal series-parallel (TTSP) graphs, the distributed task reliability can be found in polynomial time using the technique developed in [13]. For the two-terminal and-or series-parallel (AOSP) graphs, we will show in Section 2 that their distributed task reliability can also be calculated in polynomial time using the same technique [13]. In this paper, we consider a graph $H^k(\hat V, \hat E)$, named the k-replicated and-or series-parallel graph (k-replicated AOSP graph or, more simply, RAOSP graph), which is obtained from an AOSP graph $H(V, E)$ by adding $(k - 1)$ replications to each vertex and adding proper links between vertices. The main contribution of this paper is the design of a linear time algorithm to calculate the reliability of the RAOSP graph $H^k(\hat V, \hat E)$, given the base AOSP graph $H(V, E)$ and the replication degree $k$. It can be shown that the RAOSP graphs are not AOSP graphs; thus, the existing polynomial algorithm does not apply. Previously, only exponential time algorithms, as used in general graphs, are known

to find the reliability of $H^k(\hat V, \hat E)$. In this paper, we present a linear time algorithm with $O(K(|V| + |E|))$ complexity to compute the reliability of the graph $H^k(\hat V, \hat E)$, where $K = \max\{k^2 2^{2k}, 2^{3k}\}$.

The rest of the paper is organized as follows: In the next section, the definition of AOSP graphs is given and the reliability evaluation for AOSP graphs is discussed. In Section 3, we define the k-replicated AOSP graph and its reliability. An algorithm for computing the reliability of a k-replicated AOSP graph $H^k(\hat V, \hat E)$ is developed in Section 4. The algorithm is shown to have an $O(K(|V| + |E|))$ time complexity, where $K = \max\{k^2 2^{2k}, 2^{3k}\}$. A numerical example is given in Section 5. Section 6 presents an extension to the model and considers the cases of partial replication as well as unreliable vertices. Finally, concluding remarks are presented.

2. RELIABILITY EVALUATION OF AOSP GRAPHS

Consider a computation task graph $H(V, E)$ consisting of a set of software modules $V$ and a set of communication links $E$, which represents message passings between software modules. According to the logical structures and precedence relationship among the modules, a large class of task graphs can be expressed by a combination of three common types of subgraphs [5, 9]: sequential, and-fork to and-join (AFAJ), and or-fork to or-join (OFOJ) [see Fig. 3(a)-(c)], where AFAJ and OFOJ subgraphs may consist of several sequential subgraphs in a parallel structure. In this paper, we restrict our task graphs to contain a combination of these three types of subgraphs. This type of graph can be modeled as a two-terminal AOSP graph. To state the model, we make the following assumptions:

1. All modules in the task graph $H(V, E)$ are perfectly reliable.

2. Any communication link may fail with a known probability.

3. All failures are assumed to occur independently of each other.

We remind the reader that we will extend this model to consider the case of unreliable modules in Section 6.

Fig. 3. (a) Sequential graph; (b) and-fork/and-join graph; (c) or-fork/or-join graph.

Formally, a two-terminal AOSP graph of type $k$, where $k \in \{L, P_A, P_O, S\}$ (which means leaf, parallel-and, parallel-or, and series, respectively) is recursively defined as follows:

1. A single edge $(s, t)$ comprises an AOSP graph of type $L$ with terminals $s$ and $t$. The system operates if that edge operates.

Let $H_i$ be an AOSP graph with terminals $s_i$ and $t_i$ for $i = 1, 2$.

2. The graph $H_1 \wedge H_2$ is an AOSP graph of type $P_A$ with terminals $s$ and $t$, where the graph associated with $H_1 \wedge H_2$ is the disjoint union of $H_1$ and $H_2$, with $s_1$ identified with $s_2$ and $t_1$ identified with $t_2$, and the system operates if both $H_1$ and $H_2$ operate.

3. The graph $H_1 \vee H_2$ is an AOSP graph of type $P_O$ with terminals $s$ and $t$, where the graph associated with $H_1 \vee H_2$ is the same as that associated with $H_1 \wedge H_2$, except that the system operates if either $H_1$ or $H_2$ operates.

4. The graph $H_1 * H_2$ is an AOSP graph of type $S$ with terminals $s_1$ and $t_2$, where the graph associated with $H_1 * H_2$ is the disjoint union of $H_1$ and $H_2$ with $t_1$ identified with $s_2$, and the system operates if both $H_1$ and $H_2$ operate.

We note that the TTSP graphs [13] can be formulated recursively using only the operations 1, 3, and 4 of the AOSP graphs defined previously. In other words, the class of the TTSP graphs is a subclass of the AOSP graphs.

The distributed task reliability of task $H$, denoted by $R(H)$, is defined as the probability that the task $H$ operates. For example, if $H$ contains only one edge $e$ of type $L$, $R(H) = r(e)$, where $r(e)$ is the reliability of edge $e$. If $H$ consists of two AOSP graphs $H_1$ and $H_2$, then

$$R(H) = \begin{cases} R(H_1)\,R(H_2) & \text{if } H = H_1 \wedge H_2 \text{ or } H = H_1 * H_2, \\ 1 - (1 - R(H_1))(1 - R(H_2)) & \text{if } H = H_1 \vee H_2. \end{cases} \tag{1}$$

To compute $R(H)$, we first describe an AOSP graph $H$ by a binary tree structure, $T(H)$, called the parsing

tree of $H$. For example, Figure 4(b) depicts a parsing tree of the AOSP graph $H$ in Figure 4(a). The nodes in the parsing tree are numbered (at the upper right corner) according to their postorder sequence. Each leaf node in $T(H)$ corresponds to an AOSP subgraph of type $L$, i.e., a single-edge AOSP graph in $H$. In Figure 4, for example, $H_1 = e_1$, $H_2 = e_2$, $H_4 = e_3$, and $H_5 = e_4$. Each internal node is labeled by $S$, $P_O$, or $P_A$ according to the type of that AOSP subgraph. An internal node numbered $x$ in $T(H)$, along with all of its descendant nodes, induces a subtree $T_x$ which also describes an AOSP subgraph $H_x$ in $H$. For example, subtree $T_3$ describes the AOSP graph $H_3 = H_1 * H_2$ in Figure 4(b). Given the parsing tree $T(H)$, we can obtain $R(H)$ using Eq. (1) to compute the $R(H_x)$ level by level for every node $x$ in the parsing tree $T(H)$. For example, we first find $R(H_1)$ and $R(H_2)$ in Figure 4(b), where $H_1 = e_1$ and $H_2 = e_2$. Next, we consider subtrees $T_3$ and $T_6$ and compute $R(H_3) = R(H_1) R(H_2)$ and, in the same way, $R(H_6) = R(H_4) R(H_5)$. Finally, we can determine $R(H) = R(H_7) = R(H_3 \wedge H_6) = R(H_3) R(H_6)$.

Fig. 4. (a) A fork-join AOSP graph; (b) the corresponding parsing tree.

Note that an AOSP graph $H(V, E)$ can be translated into its parsing tree in $O(|E|)$ time using the algorithm proposed by Valdes et al. [13]. In other words, $R(H)$ can be found in $O(|E|)$ time.

3. TASKS WITH REPLICATION

To increase the reliability of a task, we replicate the modules and the message passing links of the task. The k-replicated task graph $H^k(\hat V, \hat E)$ of $H(V, E)$ is created by replicating each vertex in $V$ $(k - 1)$ times and letting the edges in $H^k$ be established in such a way that each vertex is a descendant not only of its predecessors in the same replica but also of the corresponding predecessors in the other replicas. For example, Figure 5(b) shows a k-replicated task graph created from an AOSP graph of type $L$ in Figure 5(a).

Fig. 5. (a) A type L AOSP graph; (b) the corresponding k-replicated RAOSP; (c) the 2-replicated RAOSP.

Imagine that each edge in $H$ represents a thread of execution; then, a replicated edge in $H^k$ represents a replica thread of the corresponding thread in $H$ [such as $(s_i, t_j)$, $1 \le i, j \le k$, are a replica of $(s, t)$ in Fig. 5(b)].

For clarity, we introduce the concept of "correct value at terminal." Without loss of generality, the source terminal of an edge is either offered a valid input value or a nil value. Suppose that a valid input value is offered at the source terminal $s$ of an AOSP graph $H$; then, the sink terminal $t$ of $H$ is said to have a "correct" value with respect to the valid input value if $H$ operates. Furthermore, the sink terminal $t$ has a nil value if either the source terminal is offered a valid input value but $H$ fails to operate or the source is offered a nil value regardless of the operation of $H$.*

Consider a k-replicated AOSP graph $H^k$. We first notice that it has $k$ source terminals and $k$ sink terminals. Without loss of generality, we assume that those source terminals of $H^k$ which are offered valid input have the same input value. It is obvious that whether a sink terminal of $H^k$ has a correct output depends not only on which source terminals of $H^k$ are offered valid input values but also on the execution of $H^k$. Let $S$ and $T$ be the sets of source and sink terminals of $H^k$, respectively. Let $A$ be the set of source terminals offered with the same valid input value and $B$ be the set of sink terminals with correct outputs; $A \subseteq S$, $B \subseteq T$. $H^k$ is said to operate w.r.t. $(A, B)$ if $\forall t_j \in B$, $t_j$ has a correct value and $\forall t^*_j \in T \setminus B$, $t^*_j$ has a nil value, given that $\forall s_i \in A$, $s_i$ is offered a valid input value and the other source terminals are offered a nil value. Now, we are ready to formally define the k-replicated AOSP graph:

1. ($L$): A k-replicated AOSP structure of type $L$ consists of a set of $k$ source terminals, $S = \{s_1, \ldots, s_k\}$, a set of $k$ sink terminals, $T = \{t_1, \ldots, t_k\}$, and $k^2$ edges, $(s_i, t_j)$, $\forall i, j$. Let $A \subseteq S$, $B \subseteq T$. The system operates w.r.t. $(A, B)$ iff $\forall t_j \in B$, $\exists s_i \in A$, such that $(s_i, t_j)$

operates and $\forall t^*_j \in T \setminus B$, $\nexists s_i \in A$ such that $(s_i, t^*_j)$ operates.

Let $H^k_i$ be a k-replicated AOSP graph with terminals set $S_i$ and $T_i$, and let $A_i \subseteq S_i$, $B_i \subseteq T_i$, for $i = 1, 2$.

2. ($P_A$): The graph $H^k_1 \wedge H^k_2$ is a k-replicated AOSP graph of type $P_A$ with terminals set $S$ and $T$, where the graph associated with $H^k_1 \wedge H^k_2$ is the disjoint union of $H^k_1$ and $H^k_2$, with $S_1$ identified with $S_2$ (i.e., $s_{1i} \in S_1$ identified with $s_{2i} \in S_2$, $1 \le i \le k$) and $T_1$ identified with $T_2$. Furthermore, $S = S_1 = S_2$ and $T = T_1 = T_2$. Suppose that $A \subseteq S$ and $B \subseteq T$. The system operates w.r.t. $(A, B)$ iff $H^k_1$ operates w.r.t. $(A, B_1)$, $H^k_2$ operates w.r.t. $(A, B_2)$, and $B_1 \cap B_2 = B$.

3. ($P_O$): The graph $H^k_1 \vee H^k_2$ is a k-replicated AOSP graph of type $P_O$ with terminals set $S$ and $T$, where the graph associated with $H^k_1 \vee H^k_2$ is the disjoint union of $H^k_1$ and $H^k_2$, with $S_1$ identified with $S_2$ and $T_1$ identified with $T_2$. Let $S = S_1 = S_2$ and $T = T_1 = T_2$. Suppose that $A \subseteq S$ and $B \subseteq T$. The system operates w.r.t. $(A, B)$ iff $H^k_1$ operates w.r.t. $(A, B_1)$, $H^k_2$ operates w.r.t. $(A, B_2)$, and $B_1 \cup B_2 = B$.

4. ($S$): The graph $H^k_1 * H^k_2$ is a k-replicated AOSP graph of type $S$ with terminals sets $S = S_1$ and $T = T_2$, where the graph associated with $H^k_1 * H^k_2$ is the disjoint union of $H^k_1$ and $H^k_2$ with $T_1$ identified with $S_2$. Suppose that $A \subseteq S$ and $B \subseteq T$. The system operates w.r.t. $(A, B)$ iff $H^k_1$ operates w.r.t. $(A, B_1)$, $H^k_2$ operates w.r.t. $(A_2, B)$, and $B_1 = A_2$.

Let $E_{AB}(H^k)$ be a probability event that $H^k$ operates w.r.t. terminal sets $(A, B)$. We denote $p_{AB}(H^k) = \Pr\{E_{AB}(H^k)\}$. We notice that for any set $A \subseteq S$, $\sum_{B \subseteq T} p_{AB}(H^k) = 1$. Given that all the source terminals are offered with the valid input value initially, i.e., $A = S$, the reliability of $H^k$ is the same as the probability that at least one sink terminal has a correct value. Thus, the reliability of $H^k$ is defined as

$$R(H^k) = \Pr\Big\{ \bigcup_{B \subseteq T,\, B \ne \emptyset} E_{SB}(H^k) \Big\} = \sum_{\emptyset \ne B \subseteq T} p_{SB}(H^k). \tag{2}$$

A straightforward way to compute $R(H^k)$ is to enumerate the execution outcomes of all the edges in $H^k(\hat V, \hat E)$, where $|\hat E| = k^2 \cdot |E|$ and $E$ is the edge set of $H$. However, this method takes $O(2^{k^2 |E|})$ time. In the next section, a linear time algorithm for computing $R(H^k)$ will be presented.

4. RELIABILITY EVALUATION OF RAOSP GRAPHS

In this section, we present an algorithm to compute the reliability of a k-replicated task $H^k(\hat V, \hat E)$ in $O(K|E|)$ time. Note that $K = \max\{k^2 2^{2k}, 2^{3k}\}$. We first consider the k-replicated task graph of type $L$. [See Fig. 5(b) for an example of $H^k$.] Suppose that $A \subseteq S$ and $B \subseteq T$. For any $t \in B$, the probability that at least one edge $(s, t)$, $s \in A$, operates is $\big(1 - \prod_{s \in A} \Pr[(s, t) \text{ fails}]\big)$. Thus, for the k-replicated AOSP graph $H^k$ of type $L$, we have

$$(L):\quad p_{AB}(H^k) = \Big\{ \prod_{t \in B} \Big( 1 - \prod_{s \in A} \Pr[(s, t) \text{ fails}] \Big) \Big\} \times \Big\{ \prod_{t \in T \setminus B} \prod_{s \in A} \Pr[(s, t) \text{ fails}] \Big\}. \tag{3}$$

For convenience, let $M$ denote the matrix of $p_{AB}(H^k)$, so that $M = [p_{AB}(H^k)]_{A \subseteq S,\, B \subseteq T}$ with dimension $2^k \times 2^k$.

For example, a 2-replicated AOSP graph of type $L$, $H^2(\hat V, \hat E)$, is obtained from edge $(s, t)$, where $\hat V = \{s_1, s_2, t_1, t_2\}$ and $\hat E = \{(s_1, t_1), (s_1, t_2), (s_2, t_1), (s_2, t_2)\}$. Let probabilities $r_1$, $r_2$, $r_3$, and $r_4$ ($\bar r_1$, $\bar r_2$, $\bar r_3$, and $\bar r_4$) be the reliabilities (unreliabilities) of edges $(s_1, t_1)$, $(s_2, t_1)$, $(s_1, t_2)$, and $(s_2, t_2)$, respectively. Then, the matrix $M$ of $H^2$ is

$$M = \begin{pmatrix} p_{\{s_1,s_2\},\{t_1,t_2\}}(H^2) & p_{\{s_1,s_2\},\{t_1\}}(H^2) & p_{\{s_1,s_2\},\{t_2\}}(H^2) & p_{\{s_1,s_2\},\{\}}(H^2) \\ p_{\{s_1\},\{t_1,t_2\}}(H^2) & p_{\{s_1\},\{t_1\}}(H^2) & p_{\{s_1\},\{t_2\}}(H^2) & p_{\{s_1\},\{\}}(H^2) \\ p_{\{s_2\},\{t_1,t_2\}}(H^2) & p_{\{s_2\},\{t_1\}}(H^2) & p_{\{s_2\},\{t_2\}}(H^2) & p_{\{s_2\},\{\}}(H^2) \\ p_{\{\},\{t_1,t_2\}}(H^2) & p_{\{\},\{t_1\}}(H^2) & p_{\{\},\{t_2\}}(H^2) & p_{\{\},\{\}}(H^2) \end{pmatrix}$$

$$= \begin{pmatrix} (1 - \bar r_1 \bar r_2)(1 - \bar r_3 \bar r_4) & (1 - \bar r_1 \bar r_2)\,\bar r_3 \bar r_4 & (1 - \bar r_3 \bar r_4)\,\bar r_1 \bar r_2 & \bar r_1 \bar r_2 \bar r_3 \bar r_4 \\ r_1 r_3 & r_1 \bar r_3 & r_3 \bar r_1 & \bar r_1 \bar r_3 \\ r_2 r_4 & r_2 \bar r_4 & r_4 \bar r_2 & \bar r_2 \bar r_4 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{4}$$
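Eq. (3) and the matrix layout of Eq. (4) translate directly into a few lines of code. The following Python sketch is illustrative only and is not taken from the paper: the function names `subsets` and `leaf_matrix`, and the nested-list input `rel` (with `rel[i][j]` standing for the reliability of edge $(s_{i+1}, t_{j+1})$), are assumptions made here for the example.

```python
from itertools import combinations
from math import prod


def subsets(k):
    """Subsets of {0, ..., k-1}, largest first.

    For k = 2 this yields {0,1}, {0}, {1}, {} and so matches the
    row/column order used in Eq. (4).
    """
    return [frozenset(c)
            for size in range(k, -1, -1)
            for c in combinations(range(k), size)]


def leaf_matrix(rel):
    """Matrix M = [p_AB] of a type-L k-replicated graph, following Eq. (3).

    rel[i][j] is the (assumed) reliability of the edge from source i to sink j.
    """
    k = len(rel)
    subs = subsets(k)
    full = frozenset(range(k))
    matrix = []
    for A in subs:
        # dead[j]: probability that no edge from a source in A reaches sink j.
        # The empty product is 1, so with A = {} every sink stays nil.
        dead = [prod(1 - rel[i][j] for i in A) for j in range(k)]
        row = [prod(1 - dead[j] for j in B) * prod(dead[j] for j in full - B)
               for B in subs]
        matrix.append(row)
    return matrix


if __name__ == "__main__":
    # 2-replicated leaf whose four edges all have reliability 0.9.
    for row in leaf_matrix([[0.9, 0.9], [0.9, 0.9]]):
        print([round(p, 4) for p in row])
```

With all four edge reliabilities set to 0.9, the first row is approximately (0.9801, 0.0099, 0.0099, 0.0001), and every row sums to 1, in line with $\sum_{B \subseteq T} p_{AB}(H^k) = 1$.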

Here, $\{\}$ represents the null set.

Next, we consider $H^k = H^k_1 \wedge H^k_2$. By definition, the terminal sets $S$ and $T$ of $H^k$ are given as $S = S_1 = S_2$ and $T = T_1 = T_2$. Suppose that $A \subseteq S$, $B_1 \subseteq T_1$, and $B_2 \subseteq T_2$; then, the event $[E_{AB_1}(H^k_1)] \wedge [E_{AB_2}(H^k_2)]$ implies the event $E_{AB}(H^k_1 \wedge H^k_2)$ if $B_1 \cap B_2 = B$. Thus,

$$E_{AB}(H^k_1 \wedge H^k_2) = \bigcup_{\substack{B_1 \subseteq T_1,\, B_2 \subseteq T_2 \\ \text{s.t. } B_1 \cap B_2 = B}} \{ E_{AB_1}(H^k_1) \wedge E_{AB_2}(H^k_2) \},$$

i.e.:

$$(P_A):\quad p_{AB}(H^k_1 \wedge H^k_2) = \sum_{\substack{B_1 \subseteq T_1,\, B_2 \subseteq T_2 \\ \text{s.t. } B_1 \cap B_2 = B}} p_{AB_1}(H^k_1) \cdot p_{AB_2}(H^k_2). \tag{5}$$

For example, let $H^2_i$ be a 2-replicated AOSP graph with terminal sets $S_i$ and $T_i$ for $i = 1, 2$, respectively. Let $S_1 = \{s_{11}, s_{12}\}$, $S_2 = \{s_{21}, s_{22}\}$, $T_1 = \{t_{11}, t_{12}\}$, and $T_2 = \{t_{21}, t_{22}\}$. Let $(H^2_1 \wedge H^2_2)$ be the 2-replicated AOSP graph obtained from $H^2_1$ and $H^2_2$ with terminals $(s_1, s_2)$ and $(t_1, t_2)$. Note that $(s_1, s_2) = (s_{11}, s_{12}) = (s_{21}, s_{22})$ and $(t_1, t_2) = (t_{11}, t_{12}) = (t_{21}, t_{22})$. By Eq. (5), for any $A \subseteq \{s_1, s_2\}$, we have

$$\begin{aligned} p_{A,\{t_1,t_2\}}(H^2_1 \wedge H^2_2) &= p_{A,\{t_{11},t_{12}\}}(H^2_1)\, p_{A,\{t_{21},t_{22}\}}(H^2_2), \\ p_{A,\{t_1\}}(H^2_1 \wedge H^2_2) &= p_{A,\{t_{11},t_{12}\}}(H^2_1)\, p_{A,\{t_{21}\}}(H^2_2) + p_{A,\{t_{11}\}}(H^2_1)\, p_{A,\{t_{21},t_{22}\}}(H^2_2) + p_{A,\{t_{11}\}}(H^2_1)\, p_{A,\{t_{21}\}}(H^2_2), \\ p_{A,\{t_2\}}(H^2_1 \wedge H^2_2) &= p_{A,\{t_{11},t_{12}\}}(H^2_1)\, p_{A,\{t_{22}\}}(H^2_2) + p_{A,\{t_{12}\}}(H^2_1)\, p_{A,\{t_{21},t_{22}\}}(H^2_2) + p_{A,\{t_{12}\}}(H^2_1)\, p_{A,\{t_{22}\}}(H^2_2), \\ p_{A,\{\}}(H^2_1 \wedge H^2_2) &= 1 - [\, p_{A,\{t_1,t_2\}}(H^2_1 \wedge H^2_2) + p_{A,\{t_1\}}(H^2_1 \wedge H^2_2) + p_{A,\{t_2\}}(H^2_1 \wedge H^2_2) \,]. \end{aligned} \tag{6}$$

Thus, a $2^2 \times 2^2$ matrix $M = [p_{AB}(H^2)]_{A \subseteq S,\, B \subseteq T}$ for graph $H^2_1 \wedge H^2_2$ can be obtained using the above equations.

Similarly, let $H^k = H^k_1 \vee H^k_2$ with terminal sets $S$ and $T$, where $S = S_1 = S_2$ and $T = T_1 = T_2$. Then, for any $A \subseteq S$ and $B \subseteq T$,

$$(P_O):\quad p_{AB}(H^k_1 \vee H^k_2) = \sum_{\substack{B_1 \subseteq T_1,\, B_2 \subseteq T_2 \\ \text{s.t. } B_1 \cup B_2 = B}} p_{AB_1}(H^k_1) \cdot p_{AB_2}(H^k_2). \tag{7}$$

For example, let $H^2_i$ be a 2-replicated AOSP graph with terminal sets $S_i$ and $T_i$ for $i = 1, 2$, respectively. Let $S_1 = \{s_{11}, s_{12}\}$, $S_2 = \{s_{21}, s_{22}\}$, $T_1 = \{t_{11}, t_{12}\}$, and $T_2 = \{t_{21}, t_{22}\}$. Let $(H^2_1 \vee H^2_2)$ be the 2-replicated AOSP graph obtained from $H^2_1$ and $H^2_2$ with terminals $(s_1, s_2)$ and $(t_1, t_2)$. Note that $(s_1, s_2) = (s_{11}, s_{12}) = (s_{21}, s_{22})$ and $(t_1, t_2) = (t_{11}, t_{12}) = (t_{21}, t_{22})$. Then, for any $A \subseteq \{s_1, s_2\}$,

$$\begin{aligned} p_{A,\{\}}(H^2_1 \vee H^2_2) &= p_{A,\{\}}(H^2_1)\, p_{A,\{\}}(H^2_2), \\ p_{A,\{t_1\}}(H^2_1 \vee H^2_2) &= p_{A,\{\}}(H^2_1)\, p_{A,\{t_{21}\}}(H^2_2) + p_{A,\{t_{11}\}}(H^2_1)\, p_{A,\{\}}(H^2_2) + p_{A,\{t_{11}\}}(H^2_1)\, p_{A,\{t_{21}\}}(H^2_2), \\ p_{A,\{t_2\}}(H^2_1 \vee H^2_2) &= p_{A,\{\}}(H^2_1)\, p_{A,\{t_{22}\}}(H^2_2) + p_{A,\{t_{12}\}}(H^2_1)\, p_{A,\{\}}(H^2_2) + p_{A,\{t_{12}\}}(H^2_1)\, p_{A,\{t_{22}\}}(H^2_2), \\ p_{A,\{t_1,t_2\}}(H^2_1 \vee H^2_2) &= 1 - [\, p_{A,\{\}}(H^2_1 \vee H^2_2) + p_{A,\{t_1\}}(H^2_1 \vee H^2_2) + p_{A,\{t_2\}}(H^2_1 \vee H^2_2) \,]. \end{aligned} \tag{8}$$

Finally, we consider $H^k = H^k_1 * H^k_2$. By definition of the k-replicated AOSP graph of type $S$, the event $[E_{AB_1}(H^k_1) \cap E_{A_2B}(H^k_2)]$ implies the event $E_{AB}(H^k_1 * H^k_2)$ if $B_1 = A_2$. So, we have

$$E_{AB}(H^k_1 * H^k_2) = \bigcup_{\substack{B_1 \subseteq T_1,\, A_2 \subseteq S_2 \\ \text{s.t. } B_1 = A_2}} [\, E_{AB_1}(H^k_1) \wedge E_{A_2B}(H^k_2) \,].$$

Therefore,

$$(S):\quad p_{AB}(H^k_1 * H^k_2) = \sum_{\substack{B_1 \subseteq T_1,\, A_2 \subseteq S_2 \\ \text{s.t. } B_1 = A_2}} p_{AB_1}(H^k_1) \cdot p_{A_2B}(H^k_2). \tag{9}$$

For example, let $H^2_i$ be a 2-replicated AOSP graph with terminal sets $S_i$ and $T_i$ for $i = 1, 2$, respectively. Let $S_1 = \{s_{11}, s_{12}\}$, $S_2 = \{s_{21}, s_{22}\}$, $T_1 = \{t_{11}, t_{12}\}$, and $T_2 = \{t_{21}, t_{22}\}$. Let $(H^2_1 * H^2_2)$ be the 2-replicated AOSP graph obtained from $H^2_1$ and $H^2_2$ with terminals $(s_1, s_2)$ and $(t_1, t_2)$. Note that $(s_1, s_2) = (s_{11}, s_{12})$, $(t_{11}, t_{12}) = (s_{21}, s_{22})$, and $(t_1, t_2) = (t_{21}, t_{22})$. Then,

$$\begin{aligned} p_{AB}(H^2_1 * H^2_2) = {} & p_{A,\{t_{11},t_{12}\}}(H^2_1)\, p_{\{s_{21},s_{22}\},B}(H^2_2) + p_{A,\{t_{11}\}}(H^2_1)\, p_{\{s_{21}\},B}(H^2_2) \\ & + p_{A,\{t_{12}\}}(H^2_1)\, p_{\{s_{22}\},B}(H^2_2) + p_{A,\{\}}(H^2_1)\, p_{\{\},B}(H^2_2). \end{aligned} \tag{10}$$

Let $H^k(\hat V, \hat E)$ be the k-replicated AOSP graph derived from $H(V, E)$. Since $H^k$ is derived from $H$, the structure of the parsing tree of $H$, $T(H)$, is equivalent to the structure of the parsing tree of $H^k$, $T(H^k)$, i.e., $T(H) \equiv T(H^k)$. The only difference is that the leaf nodes in $T(H)$ are type $L$ AOSP graphs, whereas the leaf nodes in $T(H^k)$ are type $L$ RAOSP graphs with degree $k$. Therefore, $T(H^k)$ can be obtained by applying the algorithm [13] to its base AOSP graph $H$.

As discussed, each leaf node with the postorder sequence $x$ in $T(H^k)$ corresponds to an RAOSP subgraph of type $L$, denoted as $H^k_x$. Every internal node $x$ is labeled by $S$, $P_O$, or $P_A$ according to the type of that RAOSP subgraph. Similar to $T(H)$, every internal node $x$ in $T(H^k)$, along with all its descendant nodes, induces a subtree $T_x$ which describes a k-replicated AOSP subgraph $H^k_x$ in $H^k$. Therefore, $M_r$ for root node $r$ can be determined by computing the $M_x$ level by level for every node $x$ in the parsing tree $T(H^k)$ using Eqs. (3), (5), (7), and (9). Finally, $R(H^k) = \sum_{\emptyset \ne B \subseteq T} p_{SB}(H^k)$ is obtained from the first row of matrix $M_r$. We now present the algorithm:

Algorithm 1

STEP 1. Find the parsing tree of the graph $H^k$, denoted as $T(H^k)$, by applying Valdes' algorithm to $H$ [13].

STEP 2. Evaluate the matrix $M_x$ for each node $x$ in the parsing tree $T(H^k)$ by postorder traversal.

STEP 3. Compute $R(H^k) = \sum_{\emptyset \ne B \subseteq T} p_{SB}(H^k)$, where the terms $p_{SB}(H^k)$ can be found in the first row of matrix $M_r$ at root node $r$ in $T(H^k)$.

It is known that Step 1 takes $O(|E|)$ time [13]. In Step 2, if $H^k_x$ is of type $L$, it takes $O(k^2)$ time to compute each entry in matrix $M_x$ and, thus, $O(k^2 2^{2k})$ time in total for matrix $M_x$. If $H^k_x$ is of type $P_A$ or $P_O$, it takes $O(2^k \cdot 2^k)$ time to compute each row in matrix $M_x$ and, thus, $O(2^{3k})$ time for matrix $M_x$. If $H^k_x$ is of type $S$ (say $H^k_x = H^k_y * H^k_z$), matrix $M_x$ is obtained from multiplying $M_y$ by $M_z$ and, thus, it takes $O(2^{3k})$ time to compute matrix $M_x$. Thus, the total time in Step 2 is $O(\max\{k^2 2^{2k} |E|, 2^{3k} |E|\})$. Hence the time complexity of Algorithm 1 is $O(K|E|)$, where $K = \max\{k^2 2^{2k}, 2^{3k}\}$.

5. A NUMERICAL EXAMPLE

We illustrate the calculation of the distributed task reliability in this section through an example: Consider a 2-replicated AOSP graph $H^2$ as shown in Figure 6(a), which is generated from the AOSP graph in Figure 4(a). It is readily seen that $H^2 = H^2_7 = (H^2_1 * H^2_2) \wedge (H^2_4 * H^2_5)$, where $H^2_1$, $H^2_2$, $H^2_4$, and $H^2_5$ are 2-replicated AOSP subgraphs of type $L$. [See Fig. 5(c) for $k = 2$.] In analogy to the parsing tree of $H$ in Figure 4(b), the parsing tree for $H^2$ is given in Figure 6(b).

Fig. 6. (a) A numerical example; (b) its corresponding parsing tree.

Suppose that the reliability of each link in $H^2_i$ is $r_i$, for $i = 1, 2, 4, 5$. Let $r_1 = 0.9$, $r_2 = 0.8$, $r_4 = 0.7$, and $r_5 = 0.7$. To calculate $R(H^2)$, we first calculate the $M_i$ matrices for $H^2_i$, $i = 1, 2, 4, 5$. From Eq. (4), we have

$$M_1 = \begin{pmatrix} 0.9801 & 0.0099 & 0.0099 & 0.0001 \\ 0.81 & 0.09 & 0.09 & 0.01 \\ 0.81 & 0.09 & 0.09 & 0.01 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad M_2 = \begin{pmatrix} 0.9216 & 0.0384 & 0.0384 & 0.0016 \\ 0.64 & 0.16 & 0.16 & 0.04 \\ 0.64 & 0.16 & 0.16 & 0.04 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$

$$M_4 = \begin{pmatrix} 0.8281 & 0.0819 & 0.0819 & 0.0081 \\ 0.49 & 0.21 & 0.21 & 0.09 \\ 0.49 & 0.21 & 0.21 & 0.09 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad M_5 = \begin{pmatrix} 0.8281 & 0.0819 & 0.0819 & 0.0081 \\ 0.49 & 0.21 & 0.21 & 0.09 \\ 0.49 & 0.21 & 0.21 & 0.09 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

Next, we consider the subgraphs $(H^2_1 * H^2_2)$ and $(H^2_4 * H^2_5)$, i.e., the subtrees rooted at nodes 3 and 6 in Figure 6(b). Applying Eq. (10), we have

$$M_3 = M_1 * M_2 = \begin{pmatrix} 0.9159 & 0.0408 & 0.0408 & 0.0025 \\ 0.8617 & 0.0599 & 0.0599 & 0.0185 \\ 0.8617 & 0.0599 & 0.0599 & 0.0185 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad M_6 = M_4 * M_5 = \begin{pmatrix} 0.7660 & 0.1022 & 0.1022 & 0.0296 \\ 0.6116 & 0.1283 & 0.1283 & 0.1318 \\ 0.6116 & 0.1283 & 0.1283 & 0.1318 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

Finally, we consider $H^2$. Given $M_3$ and $M_6$, we obtain the matrix $M = M_7$ for $H^2$ using Eq. (6). Thus,

$$M = M_7 = \begin{pmatrix} p_{\{s_1,s_2\},\{t_1,t_2\}}(H^2) & p_{\{s_1,s_2\},\{t_1\}}(H^2) & p_{\{s_1,s_2\},\{t_2\}}(H^2) & p_{\{s_1,s_2\},\{\}}(H^2) \\ p_{\{s_1\},\{t_1,t_2\}}(H^2) & p_{\{s_1\},\{t_1\}}(H^2) & p_{\{s_1\},\{t_2\}}(H^2) & p_{\{s_1\},\{\}}(H^2) \\ p_{\{s_2\},\{t_1,t_2\}}(H^2) & p_{\{s_2\},\{t_1\}}(H^2) & p_{\{s_2\},\{t_2\}}(H^2) & p_{\{s_2\},\{\}}(H^2) \\ p_{\{\},\{t_1,t_2\}}(H^2) & p_{\{\},\{t_1\}}(H^2) & p_{\{\},\{t_2\}}(H^2) & p_{\{\},\{\}}(H^2) \end{pmatrix} = \begin{pmatrix} 0.7016 & 0.1290 & 0.1290 & 0.0404 \\ 0.5270 & 0.1549 & 0.1549 & 0.1632 \\ 0.5270 & 0.1549 & 0.1549 & 0.1632 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
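The matrix computations of this example can be checked with a short script. The following Python sketch is an illustration under stated assumptions rather than the authors' implementation: terminal subsets are encoded as bitmasks (so the row and column order differs from Eq. (4), with index 0 the empty set and the last index the full set), every edge inside a leaf is assumed to share one reliability as in this example, and the function names `leaf`, `series`, `parallel`, and `reliability` are hypothetical.

```python
from operator import and_  # operator.or_ would give the P_O rule of Eq. (7)


def leaf(r, k):
    """Type L, Eq. (3), assuming all k*k replica edges have reliability r."""
    n = 1 << k
    M = [[0.0] * n for _ in range(n)]
    for A in range(n):
        # Probability that no edge from a source in A reaches a given sink.
        dead = (1 - r) ** bin(A).count("1")
        for B in range(n):
            b = bin(B).count("1")
            M[A][B] = (1 - dead) ** b * dead ** (k - b)
    return M


def series(M1, M2):
    """Type S, Eq. (9): sum over B1 = A2, i.e. an ordinary matrix product."""
    n = len(M1)
    return [[sum(M1[A][C] * M2[C][B] for C in range(n)) for B in range(n)]
            for A in range(n)]


def parallel(M1, M2, combine):
    """Types P_A / P_O, Eqs. (5) and (7).

    combine maps (B1, B2) to B: bitwise AND is set intersection (P_A),
    bitwise OR is set union (P_O).
    """
    n = len(M1)
    M = [[0.0] * n for _ in range(n)]
    for A in range(n):
        for B1 in range(n):
            for B2 in range(n):
                M[A][combine(B1, B2)] += M1[A][B1] * M2[A][B2]
    return M


def reliability(M):
    """Eq. (2): all sources valid (last row), at least one sink correct."""
    return sum(M[-1][1:])


k = 2
M1, M2, M4, M5 = (leaf(r, k) for r in (0.9, 0.8, 0.7, 0.7))
M3, M6 = series(M1, M2), series(M4, M5)
M7 = parallel(M3, M6, and_)

R_replicated = reliability(M7)
R_base = 0.9 * 0.8 * 0.7 * 0.7  # Eq. (1) applied to the unreplicated graph H

if __name__ == "__main__":
    print("replicated:", R_replicated, "unreplicated:", R_base)
```

Because the script keeps full floating-point precision, its result can differ in the fourth decimal place from a value assembled from the rounded matrix entries shown in this section. The unreplicated baseline, $0.9 \times 0.8 \times 0.7 \times 0.7 = 0.3528$, is included only to show how much a single extra replica raises the task reliability.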

Finally, the reliability of $H^2$ can be obtained by Eq. (2):

$$\begin{aligned} R(H^2) &= \Pr\Big\{ \bigcup_{B \subseteq T,\, B \ne \emptyset} E_{SB}(H^2) \Big\} = \sum_{\emptyset \ne B \subseteq T} p_{SB}(H^2) \\ &= p_{\{s_1,s_2\},\{t_1,t_2\}}(H^2) + p_{\{s_1,s_2\},\{t_1\}}(H^2) + p_{\{s_1,s_2\},\{t_2\}}(H^2) \\ &= 0.7016 + 0.1290 + 0.1290 = 0.9596. \end{aligned}$$

6. FURTHER DISCUSSION

In Section 4, we assumed that a k-replicated AOSP graph is fully replicated, i.e., each vertex is replicated exactly $(k - 1)$ times and each edge is replicated $(k - 1)^2$ times. Furthermore, we assumed that all vertices are perfectly reliable. In this section, we first extend our model to consider those cases where vertices and edges are partially replicated. Later on, we present a solution method for the problem in which both vertices and edges can fail.

To calculate the reliability of a partially replicated RAOSP graph, we first convert such a graph into a fully replicated RAOSP graph, then calculate its reliability using Algorithm 1. Suppose that $H^k = (\hat V, \hat E)$ is an RAOSP graph of type $L$ and is fully replicated. The RAOSP graph $H^{*k} = (\bar V, \bar E)$ is partially replicated if $\bar V \subseteq \hat V$ and/or $\bar E \subseteq \hat E$. To compute $R(H^{*k})$, we first add those unreplicated vertices and edges (vertices and edges that would have been in a fully replicated $H^k$) into $\bar V$ and $\bar E$ and simply set the reliability of those edges to 0, i.e., $\Pr\{(s_i, t_j) \text{ operates}\} = 0$, $\forall (s_i, t_j) \in \hat E \wedge (s_i, t_j) \notin \bar E$. After adding those edges, $H^{*k}$ becomes fully replicated and $R(H^{*k})$ can be obtained via Eq. (3). Now, we summarize the reliability analysis of the general RAOSP graph $H^k$ with partial replication: we first find its parsing tree using the same algorithm in [13]. Then, we convert each RAOSP graph of type $L$ in $H^k$ into a fully replicated RAOSP graph of type $L$ as shown above. Then, we apply Steps 2 and 3 in Algorithm 1 to obtain the reliability of $H^k$.

We next consider the RAOSP graphs with unreliable vertices. We begin the discussion with the AOSP graphs. To incorporate the unreliable vertices into our graph model, for each vertex (or terminal) $t$ in an AOSP graph, we replace it by an edge $(t, t^*)$ and assign the reliability of that edge to be the reliability of that vertex, i.e., one minus its failure probability. [See Fig. 7(b) as an example.] Similarly, to consider the unreliable vertices (or terminals) in an RAOSP graph $H^k$,

we can also replace each terminal in $H^k$ by an edge. As shown in Figure 7(c) and (d), a terminal is replicated once and each terminal is then replaced by an edge. We notice that this structure is an RAOSP graph of type $L$ with partial replication, and, thus, its reliability can be obtained using the method presented in the previous section.

Fig. 7. (a) A single vertex; (b) its AOSP equivalent; (c) a vertex and its replica; (d) its RAOSP equivalent.

7. CONCLUSIONS

This paper has focused on the design of an efficient algorithm to predict the reliability of tasks characterized by replicated and-or series-parallel (RAOSP) graphs. A k-replicated AOSP graph is derived from an AOSP graph with vertex and edge replications. Conventional algorithms may apply to compute the reliability of a k-replicated AOSP graph. However, these algorithms take exponential time in the number of edges. We have presented an algorithm with time complexity $O(K(|V| + |E|))$, where $K = \max\{k^2 2^{2k}, 2^{3k}\}$ and $|V|$ and $|E|$ are the number of vertices and edges, respectively, in its corresponding AOSP graph. In real-life applications, $k$ is typically small whereas $|V| + |E|$ is much larger; thus, our algorithm is a significant improvement over the traditional approaches.

Many graph-related problems are NP-complete for general graphs but can be solved in polynomial time for AOSP graphs. In this paper, we have shown that the distributed task reliability problem can also be solved in [...] graph problems there may also exist polynomial-time algorithms for RAOSP graphs provided there exist polynomial-time algorithms for AOSP graphs.

REFERENCES

[1] A. Agrawal and R. E. Barlow, A survey of network reliability and domination theory. Oper. Res. 32 (1984) 478–492.

[2] P. A. Alsberg and J. D. Day, A principle for resilient sharing of distributed resources. Proceedings of the 2nd International Conference on Software Engineering (Oct. 1976) 562–570.

[3] S. Arnborg and A. Proskurowski, On network reliability. Discr. Appl. Math. 23(1) (1989) 11–24.

[4] M. O. Ball, Computational complexity of network reliability analysis: An overview. IEEE Trans. Reliab. R-35(3) (1986) 230–239.

[5] W. W. Chu and K. K. Leung, Module replication and assignment for real-time distributed processing systems. Proceed. IEEE 75 (1987) 547–562.

[6] C. J. Colbourn, The Combinatorics of Network Reliability. Oxford University Press, New York (1987).

[7] S. Hariri and C. S. Raghavendra, Syrel: A symbolic reliability algorithm based on path and cutset methods. IEEE Trans. Comput. C-36 (1987) 1224–1232.

[8] S.-T. Levi, S. Tripathi, S. Carson, and A. Agrawala, The maruti hard real-time operating system. ACM Oper. Syst. Rev. 23 (1989) 90–105.

[9] V. W. Mak and S. F. Lundstrom, Predicting performance of parallel computations. IEEE Trans. Parallel Distrib. Syst. 1 (1990) 257–270.

[10] D. Powell, Delta-4: Overall System Specification. The Delta-4 Project Consortium (1988).

[11] C. S. Raghavendra, V. K. Prasanna, J. Kumar, and S. Hariri, Reliability analysis in distributed systems. IEEE Trans. Comput. C-37 (1988) 352–358.

[12] A. Satyanarayana and A. Prabhakar, New topological formula and rapid algorithm for reliability analysis of complex networks. IEEE Trans. Reliab. R-27(2) (1979) 82–100.

[13] J. Valdes, R. Tarjan, and E. L. Lawler, The recognition of series parallel digraphs. SIAM J. Comput. 11 (1982) 298–313.