Conditional-Fault Diagnosability of Multiprocessor Systems with an Efficient Local Diagnosis Algorithm under the PMC Model

Academic year: 2021

Cheng-Kuan Lin, Tzu-Liang Kung, and Jimmy J.M. Tan

Abstract—Diagnosis is an essential subject for the reliability of multiprocessor systems. Under the PMC diagnosis model, Dahbura and Masson [12] proposed a polynomial-time algorithm with time complexity $O(N^{2.5})$ to identify all the faulty processors in a system with $N$ processors. In this paper, we present a novel method to diagnose a conditionally faulty system by applying the concept behind the local diagnosis, introduced by Somani and Agarwal [30], and formalized by Hsu and Tan [18]. The goal of local diagnosis is to identify the fault status of any single processor correctly. Under the PMC diagnosis model, we give a sufficient condition to estimate the local diagnosability of a given processor. Furthermore, we propose a helpful structure, called the augmenting star, to efficiently determine the fault status of each processor. For an $N$-processor system in which every processor has degree $O(\log N)$, the time complexity of our algorithm to diagnose any given processor is $O((\log N)^2)$, provided that each processor can construct an augmenting star structure of full order in time $O((\log N)^2)$ and the time for a processor to test another one is constant. Therefore, diagnosing the whole system takes $O(N(\log N)^2)$ time in total.

Index Terms—Fault diagnosis, PMC model, diagnosability, reliability, diagnosis algorithm.

1 INTRODUCTION

Recently, high-speed multiprocessor systems have become more and more popular in computer technology. A multiprocessor system consists of processors and communication links between processors. The reliability of processors is crucial since even a few malfunctioning processors may lead to a severe system breakdown. Whenever processors are found to be faulty, they should be replaced with fault-free ones as soon as possible to guarantee the system can work properly.

Identifying all the faulty processors in a system is known as system-level diagnosis. Preparata et al. [27] distinguished two types of self-diagnosable systems: one-step diagnosable systems and sequentially diagnosable systems. A system is said to be one-step t-diagnosable if all its faulty processors can be precisely pointed out by one application of a diagnostic process provided that the total number of faulty processors does not exceed t, whereas a system is sequentially t-diagnosable if at least one faulty processor can be identified provided that the total number of faulty processors does not exceed t. In this paper, we focus on one-step diagnosis only. The maximum number of faulty processors that can be correctly identified is an important parameter, known as the one-step diagnosability of a system. In other words, the one-step diagnosability of a system G is the maximum integer t such that G is one-step t-diagnosable.

In practice, some multiprocessor systems are based on an underlying bus structure, or fabric, and are perfectly feasible for a centralized test controller (an independent processor acting as a controller) to check each processor in the system. In such a scheme, the centralized controller itself can be tested externally. Some research is related to the issue of network-on-chip (NoC); for example, Pande et al. [26] developed an evaluation methodology to compare the performance and characteristics of a variety of NoC topologies; Bartic et al. [4] presented an NoC design which is suitable for building networks with irregular topologies. Instead, a self-diagnosable system contains no centralized test controller. In a self-diagnosable system, a testing signal is supposed to be delivered from a processor to another one through the communication bus at one time. Then the system performs self-diagnosis by making each processor act as a tester to test each of the directly connected ones. This paper is concerned with the self-diagnosis.

• C.-K. Lin and J.J.M. Tan are with the Department of Computer Science, National Chiao Tung University, 1001 University Road, Hsinchu 30010, Taiwan. E-mail: {cklin, jmtan}@cs.nctu.edu.tw.

• T.-L. Kung is with the Department of Computer Science and Information Engineering, Asia University, 500 Lioufeng Road, Wufeng, Taichung 41354, Taiwan. E-mail: [email protected].

Manuscript received 3 June 2010; revised 27 Oct. 2010; accepted 2 Nov. 2010; published online 19 Jan. 2011.

Recommended for acceptance by A. Nayak.

For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TPDS-2010-06-0328. Digital Object Identifier no. 10.1109/TPDS.2011.46.

1.1 Diagnosis Models

The problem of system-level diagnosis has been widely discussed by many researchers [9], [12], [13], [15], [16], [18], [20], [23], [24], [27], [29]. Several well-known approaches have been developed. One classic approach, called the PMC diagnosis model (or PMC model for short), was first proposed by Preparata et al. [27]. This model performs diagnosis by sending a test signal from a processor to another linked one and then receiving a response in the reverse direction. According to the collection of all test outcomes, the fault status of every processor can be identified. The fundamental assumption of the PMC model is that a test outcome is reliable if and only if the testing processor is fault-free. Another diagnostic model, called the BGM model [3], was defined by Barsi, Grandoni, and Maestrini with an asymmetric interpretation of test outcomes reported by faulty units. In the BGM model, any test that is complete for a given class of faults in a unit necessarily consists of a sequence of a large number of stimuli. Supposedly, there must be at least one mismatch between actual and expected reaction to the stimuli whenever the tested unit is faulty, even if the testing unit itself is faulty.

In this paper, we address the PMC model. Following this model, Hakimi and Amin [16] proved that a system is one-step t-diagnosable if it is t-connected with at least $2t + 1$ nodes. They also gave a necessary and sufficient condition for verifying whether a system is one-step t-diagnosable under the PMC model. More practically, Dahbura and Masson [12] presented an $O(N^{2.5})$ diagnosis algorithm to identify all the faulty processors in a system with $N$ processors.

Note that only processors with direct connections are allowed to test each other. In particular, if all the neighboring processors of a processor v are faulty simultaneously, then it is not possible to determine whether processor v is fault-free or faulty. In this way, the one-step diagnosability of a system G is trivially bounded by the minimum degree of G. For most practical systems that are sparsely connected, only a small number of faulty processors can be recognized under the PMC diagnosis model. Therefore, it has long been an intriguing issue to explore some measure that can better reflect fault patterns in a real system. For example, Das et al. [13] investigated fault diagnosis under local constraints; Lai et al. [20] proposed a new measure of diagnosis capability, namely conditional diagnosability, by requiring that for each processor in a system, all its neighboring processors do not fail at the same time. Recently, Xu et al. [31] investigated the conditional diagnosability with respect to a class of matching composition networks. However, these works did not provide any diagnosis algorithm, so it is not clear how to identify faults efficiently in such a situation. In this paper, we relax the condition imposed in [20], [31] by assuming that every fault-free processor has at least one fault-free neighbor. Under this assumption, not only can the diagnosis capability be proved theoretically, but it is also guaranteed from an algorithmic point of view.

1.2 The State-of-the-Art

Many previous studies about system-level diagnosis were devoted to the diagnosis capability in a global sense but ignored some local connection. For instance, it is likely that all the faulty processors in a t-diagnosable system can be correctly pointed out even when the number of faulty processors is already greater than t. Consider two hypercube systems $Q_m$ and $Q_n$, which are known to be m-diagnosable and n-diagnosable [19], respectively, where m and n are two integers with $m \ge n$. A new system G can be built by integrating these two systems with a few communication links in some way that makes the new system have one-step diagnosability limited by n. See Fig. 1 for illustration. Consider the following scenario: There are m faulty processors within $Q_m$. Then these m faulty processors can be correctly identified because $Q_m$ is m-diagnosable. Even though this new system is only n-diagnosable, it is of high probability that the correct diagnosis can be made when the total number of faulty processors is between n and m.

In the last two decades, a variety of methods were developed to achieve system-level diagnosis for various interconnected structures. For example, Chessa and Maestrini [10] introduced a correct and almost complete diagnosis method for square grids. Later, Caruso et al. [6], [7], [8] presented two correct and almost complete diagnosis algorithms, called EDARS and NDA, respectively. The two algorithms have time complexity $O(kN)$ when applied to k-regular systems of $N$ units. A lower bound to the worst-case diagnosis completeness for regular graphs under the PMC model is shown in [9]. Recently, Mánik and Gramatová [22], [23] proposed the Boolean formalization of the PMC model for the syndrome-decoding process. When this approach is applied to regular systems, the computation time of fault diagnosis can be significantly reduced. In addition, Somani and Agarwal [30] developed a distributed diagnosis algorithm for regular systems based on the concept of local diagnosis. Later, Altmann et al. [2] addressed an event-driven distributed approach to multiprocessor diagnosis, and Masuyama and Miyoshi [24] presented a nonadaptive distributed system-level diagnosis method for computer networks.

Fig. 1. An n-diagnosable system obtained by integrating an m-diagnosable subsystem and an n-diagnosable subsystem.

In some circumstances, however, we are only concerned about some substructure of a multiprocessor system, which is implementable in very large-scale integration (VLSI). Such a substructure, for example, can be a ring, a path, a tree, a mesh, and so on. If all processors in these substructures can be guaranteed to be fault-free, a procedure is still workable even though there are many faulty processors in the remaining part of the system. Thus, the local substructure plays a more critical role than the global fault status of the entire system. Motivated by such a concept, Hsu and Tan [18] presented an elegant measure of diagnosis capability, known as local diagnosability, to identify the one-step diagnosability of a system by computing the local diagnosability with respect to each individual processor. For any processor in a system, two useful structures [18] were presented to determine its local diagnosability under the PMC model. Hence, in this paper, we extend the previous study and design an efficient diagnosis algorithm based on the proposed structure, named the augmenting star, provided that each fault-free processor has at least one fault-free neighboring processor. In short, our algorithm depends on the existence of the augmenting star structure. Moreover, the key difference between our work and the others is that we address the conditional-fault identification problem from the standpoint of local diagnosis. For many practical multiprocessor systems, the number of links incident to each processor is in the order of $\log N$, where $N$ is the total number of processors. Accordingly, the time for diagnosing any given processor v can be bounded by $O((\log N)^2)$ if there exists an augmenting star structure rooted at processor v. So all the faulty processors can be identified one by one with time complexity $O(N(\log N)^2)$, provided that the augmenting star can also be constructed at each processor in time $O((\log N)^2)$.

The rest of this paper is organized as follows: Section 2 provides preliminary background for system-level diagnosis and graph-theoretic terminology. Section 3 introduces how to diagnose a system with random faults. A diagnosis algorithm based on the augmenting star structure is presented in Section 4. Some examples are shown in Section 5. Finally, our conclusions are given in Section 6.

2 PRELIMINARIES

The underlying topology of a multiprocessor system is usually modeled as a graph, whose vertex set and edge set represent the set of all processors and the set of all communication links between processors, respectively. Throughout this paper graphs are finite, simple, and unless specified otherwise, undirected. Some important graph-theoretic definitions and notations will be introduced in advance. For those not defined here, however, we follow the standard terminology given by Bondy and Murty [5].

An undirected graph $G$ is an ordered pair $(V, E)$, where $V$ is a nonempty set and $E$ is a subset of $\{\{u, v\} \mid \{u, v\} \text{ is a 2-element subset of } V\}$.¹ The set $V$ is called the vertex set of $G$, and the set $E$ is called the edge set of $G$. For convenience, we denote the vertex set and the edge set of $G$ by $V(G)$ and $E(G)$, respectively. Two vertices, $u$ and $v$, in graph $G$ are adjacent if $\{u, v\} \in E(G)$; we say $u$ is a neighbor of $v$, and vice versa. The degree of a vertex $v$ in $G$, denoted by $\deg_G(v)$, is the number of edges incident to $v$. The neighborhood of vertex $v$, denoted by $N_G(v)$, is the set of vertices adjacent to $v$. For a set $S \subseteq V$, the notation $G - S$ represents the graph obtained by removing every vertex in $S$ from $G$ and deleting those edges incident to at least one vertex in $S$. A graph $H$ is a subgraph of $G$ if $V(H) \subseteq V(G)$ and $E(H) \subseteq E(G)$. The components of a graph $G$ are its maximal connected subgraphs. A component is trivial if it has no edges; otherwise, it is nontrivial.

In the PMC model [27], adjacent units are capable of performing tests on each other. A testing unit $u_i$ specifies some test sequence to a tested unit $u_j$ and receives a response sequence from $u_j$. The testing unit outputs a test outcome $a_{i,j} = 1$ if the actual response sequence mismatches the expected one; otherwise, $a_{i,j} = 0$. Let an undirected graph $G = (V, E)$ denote the underlying topology of a multiprocessor system. For any two adjacent vertices $u, v \in V$, the ordered pair $(u, v)$ represents the test in which processor $u$ diagnoses processor $v$. In this situation, $u$ is a tester, and $v$ is a testee. The outcome of a test $(u, v)$ is 1 (respectively, 0) if $u$ evaluates $v$ to be faulty (respectively, fault-free). The notation $u \xrightarrow{a} v$ means that $u$ tests $v$ with outcome $a \in \{0, 1\}$. Because the faults considered here are permanent, the outcome of a test is reliable if and only if the tester is fault-free.

A test assignment for system $G$ is a collection of tests and thus can be modeled as a directed graph $T = (V, L)$, where $(u, v) \in L$ and $(v, u) \in L$ if and only if $\{u, v\} \in E$. The collection of all test outcomes from the test assignment $T$ is called a syndrome. Formally, a syndrome of $T$ is a mapping $\sigma : L \to \{0, 1\}$. The set $F$ of all faulty processors in $G$ is called a faulty set. Note that $F$ can be any subset of $V$. The process of identifying all faulty vertices is said to be the system-level diagnosis. Furthermore, the maximum number of faulty vertices that can be correctly identified in a system $G$ is called the one-step diagnosability of $G$, denoted by $\tau(G)$.
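The PMC test semantics described above can be illustrated with a small sketch. It assumes the test assignment in which every pair of adjacent vertices test each other; the function name `pmc_syndrome`, the edge-list representation, and the use of a seeded random generator to model the arbitrary outcomes of faulty testers are illustrative choices, not part of the paper.

```python
import random


def pmc_syndrome(edges, faulty, rng=None):
    """Return one possible syndrome {(tester, testee): outcome} under the PMC model.

    edges  : iterable of undirected edges (u, v)
    faulty : set of faulty vertices (the faulty set F)
    rng    : random source used only for the unreliable outcomes of faulty testers
    """
    rng = rng or random.Random(0)
    syndrome = {}
    for u, v in edges:
        # Adjacent vertices test each other, so each edge yields two tests.
        for tester, testee in ((u, v), (v, u)):
            if tester in faulty:
                # A faulty tester is unreliable: its outcome may be 0 or 1.
                syndrome[(tester, testee)] = rng.randint(0, 1)
            else:
                # A fault-free tester reports 1 exactly when the testee is faulty.
                syndrome[(tester, testee)] = 1 if testee in faulty else 0
    return syndrome


# A 4-cycle 0-1-2-3-0 in which only vertex 2 is faulty.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
sigma = pmc_syndrome(edges, faulty={2})
```

In this example the tests performed by the fault-free vertices 1 and 3 on vertex 2 are forced to be 1, while the tests performed by vertex 2 itself may take either value.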

For any given syndrome $\sigma$ resulting from a test assignment $T = (V, L)$, a subset of vertices $F \subseteq V$ is said to be consistent with $\sigma$ if for any $(u, v) \in L$ with $u \in V - F$, $\sigma(u, v) = 1$ if and only if $v \in F$. This corresponds to the assumption that fault-free testers always give correct test results, whereas faulty testers can lead to unreliable results. Therefore, a given set $F$ of faulty vertices may be consistent with different syndromes. Let $\sigma(F)$ denote the set of all possible syndromes with which the faulty set $F$ can be consistent. Then two distinct faulty sets $F_1, F_2 \subseteq V$ are said to be distinguishable if $\sigma(F_1) \cap \sigma(F_2) = \emptyset$; otherwise, $F_1$ and $F_2$ are said to be indistinguishable. That is, $(F_1, F_2)$ is a distinguishable pair (respectively, an indistinguishable pair) of faulty sets if $\sigma(F_1) \cap \sigma(F_2) = \emptyset$ (respectively, $\sigma(F_1) \cap \sigma(F_2) \ne \emptyset$).

Lemma 1 [12]. A system $G$ is one-step t-diagnosable if and only if for any two distinct faulty sets $F_1, F_2 \subseteq V(G)$ with $|F_1| \le t$ and $|F_2| \le t$, $(F_1, F_2)$ is a distinguishable pair.

Let $F_1, F_2$ be two distinct sets, and let $F_1 \triangle F_2 = (F_1 - F_2) \cup (F_2 - F_1)$ denote the symmetric difference between $F_1$ and $F_2$. Dahbura and Masson [12] presented a necessary and sufficient characterization of one-step t-diagnosable systems and exploited it to design a polynomial-time algorithm for identifying the set of faulty processors.

Lemma 2 [12]. Let $G = (V, E)$ be a graph. For any two distinct faulty sets $F_1, F_2 \subseteq V$, $(F_1, F_2)$ is a distinguishable pair if and only if there exists a vertex $u \in V - (F_1 \cup F_2)$ and a vertex $v \in F_1 \triangle F_2$ such that $\{u, v\} \in E$.
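As a sanity check of these definitions, the following sketch compares the criterion of Lemma 2 with the definition of distinguishability by brute force on a 4-cycle. All function names are hypothetical, syndromes are stored as dictionaries keyed by (tester, testee), and the exhaustive enumeration is only practical for very small graphs.

```python
from itertools import combinations, product


def is_consistent(syndrome, faulty):
    """F is consistent with a syndrome if every fault-free tester reports
    1 exactly when the testee belongs to F."""
    return all(
        outcome == (1 if v in faulty else 0)
        for (u, v), outcome in syndrome.items()
        if u not in faulty
    )


def distinguishable_by_lemma2(edges, f1, f2):
    """Lemma 2: some edge joins a vertex outside F1 | F2 to a vertex in F1 ^ F2."""
    sym_diff, union = f1 ^ f2, f1 | f2
    return any(
        (u not in union and v in sym_diff) or (v not in union and u in sym_diff)
        for u, v in edges
    )


def distinguishable_by_definition(edges, f1, f2):
    """Definition: no syndrome is consistent with both F1 and F2."""
    tests = [t for u, v in edges for t in ((u, v), (v, u))]
    for outcomes in product((0, 1), repeat=len(tests)):
        syndrome = dict(zip(tests, outcomes))
        if is_consistent(syndrome, f1) and is_consistent(syndrome, f2):
            return False
    return True


edges = [(0, 1), (1, 2), (2, 3), (3, 0)]  # 4-cycle
subsets = [frozenset(c) for r in range(5) for c in combinations(range(4), r)]
agree = all(
    distinguishable_by_lemma2(edges, a, b) == distinguishable_by_definition(edges, a, b)
    for a, b in combinations(subsets, 2)
)
```

For instance, with vertex 0 and its neighborhood {1, 3}, the pair ({1, 3}, {0, 1, 3}) fails the Lemma 2 test because the only vertex outside both sets, vertex 2, is not adjacent to vertex 0.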

3 RANDOM-FAULT DIAGNOSIS

For a multiprocessor system, the random-fault model assumes that the probabilities of processor failures are identical and independent. Let $v$ be any vertex in a graph $G$. It is intuitive to observe that $(N_G(v), \{v\} \cup N_G(v))$ forms an indistinguishable pair of faulty sets. That is, the conventional one-step diagnosability is mainly concerned with the global status of a system under the random-fault model. Instead, Hsu and Tan [18] turned their attention to the local connective substructure in a system. More precisely, given any single vertex $v$ in a graph, it is only required to determine whether $v$ is faulty or not. The following concept is proposed in [18].

1. We denote by $\{u, v\}$ and $(u, v)$ an undirected edge and a directed edge from $u$ to $v$, respectively.

Definition 1 [18]. Let $G$ be a graph and $v$ denote any one of its vertices. Then $G$ is locally t-diagnosable at vertex $v$ if, given a syndrome $\sigma_F$ produced by a set of faulty vertices $F \subseteq V$ with $v \in F$ and $|F| \le t$, every faulty set of at most $t$ vertices that is also consistent with $\sigma_F$ must contain vertex $v$.

By Definition 1, Hsu and Tan [18] further proved that a graph $G$ is locally t-diagnosable at vertex $v$ if and only if for any two distinct sets of vertices $F_1, F_2 \subseteq V(G)$ such that $|F_1|, |F_2| \le t$ and $v \in F_1 \triangle F_2$, $(F_1, F_2)$ is a distinguishable pair. It was also shown that a graph $G$ is one-step t-diagnosable if and only if it is locally t-diagnosable at every vertex. Moreover, the local diagnosability of a vertex $v$ in $G$, denoted by $\tau_G(v)$, is defined to be the maximum integer $t$ such that $G$ is locally t-diagnosable at vertex $v$. The relationship between one-step diagnosability and local diagnosability is revealed in the next lemma.

Lemma 3 [18]. Let $G$ denote the underlying topology of a multiprocessor system. Then $\tau(G) = \min\{\tau_G(v) \mid v \in V(G)\}$.

In [18], the following structure is presented to compute the local diagnosability with respect to any given vertex under the PMC model.

Definition 2 [18]. Letting $G = (V, E)$ be a graph, $v \in V$ be any vertex, and $k$ be an integer greater than or equal to 1, an extending star of order $k$ rooted at vertex $v$ is defined to be the subgraph of $G$, denoted by $\mathcal{T}_G(v, k) = (V(v, k), E(v, k))$, where $V(v, k) = \{v\} \cup \{u_{ij} \mid 1 \le i \le 2,\ 1 \le j \le k\}$ and $E(v, k) = \{\{v, u_{1j}\}, \{u_{1j}, u_{2j}\} \mid 1 \le j \le k\}$. An extending star of order $k$ is said to be of full order if $k = \deg_G(v)$. See Fig. 2 for illustration.

In practice, it is more useful to have an efficient procedure that is capable of identifying the fault status of a given vertex. Based on the extending star, a polynomial-time algorithm, namely Diagnose-Vertex-In-Random-Faults (DVRF for short), is proposed to determine whether any given vertex is faulty or not.

Algorithm. DVRF(G, v)

Input: Any vertex $v$ in a graph $G$, in which there exists an extending star of full order rooted at $v$.

Output: The fault status of vertex $v$. By convention, the algorithm outputs 0 if vertex $v$ is fault-free and 1 if vertex $v$ is faulty.

BEGIN

1) $t \leftarrow \deg_G(v)$.

2) Construct an extending star of order $t$ rooted at vertex $v$, $\mathcal{T}_G(v, t)$, as illustrated in Fig. 2.

3) $n_0 \leftarrow |\{1 \le j \le t \mid (\sigma(u_{2j}, u_{1j}), \sigma(u_{1j}, v)) = (0, 0)\}|$;
   $n_1 \leftarrow |\{1 \le j \le t \mid (\sigma(u_{2j}, u_{1j}), \sigma(u_{1j}, v)) = (0, 1)\}|$.

4) if $n_0 \ge n_1$ then return 0 else return 1.

END
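The four steps of DVRF can be sketched in Python as follows. This is a minimal illustration under assumptions: the extending star is passed in as a list of branches `(u1, u2)` (so step 2 is taken as already done), the syndrome is a dictionary keyed by `(tester, testee)`, and the function name `dvrf` and the vertex labels in the example are hypothetical.

```python
def dvrf(v, branches, syndrome):
    """Return 0 if v is diagnosed fault-free and 1 if v is diagnosed faulty.

    v        : the root vertex to be diagnosed
    branches : list of (u1, u2) pairs, one per branch v - u1 - u2 of an
               extending star of full order rooted at v
    syndrome : dict mapping (tester, testee) to a test outcome in {0, 1}
    """
    n0 = n1 = 0
    for u1, u2 in branches:
        outcomes = (syndrome[(u2, u1)], syndrome[(u1, v)])
        if outcomes == (0, 0):
            n0 += 1  # the branch votes "v is fault-free"
        elif outcomes == (0, 1):
            n1 += 1  # the branch votes "v is faulty"
        # Branches whose first outcome is 1 are ignored, as in step 3.
    return 0 if n0 >= n1 else 1


# Root 0 with three branches 0-1-4, 0-2-5, 0-3-6 (t = 3).
branches = [(1, 4), (2, 5), (3, 6)]

# Case A: only the root is faulty, so every fault-free u1 reports 1 on the root.
syndrome_a = {(4, 1): 0, (1, 0): 1, (5, 2): 0, (2, 0): 1, (6, 3): 0, (3, 0): 1}

# Case B: the root is fault-free, vertices 1 and 4 are faulty and both lie.
syndrome_b = {(4, 1): 0, (1, 0): 1, (5, 2): 0, (2, 0): 0, (6, 3): 0, (3, 0): 0}

result_a = dvrf(0, branches, syndrome_a)
result_b = dvrf(0, branches, syndrome_b)
```

In both cases the number of faulty vertices in the star does not exceed t = 3, which is the regime in which the majority vote between the (0, 0) and (0, 1) branches is intended to be used.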

Theorem 1. Let $G$ be a graph, $v \in V(G)$, and $t = \deg_G(v)$. Suppose that there exists an extending star of full order rooted at vertex $v$, $\mathcal{T}_G(v, t)$. Then the algorithm DVRF(G, v) correctly identifies the fault status of vertex $v$ if the total number of faulty vertices in $\mathcal{T}_G(v, t)$ does not exceed $t$.

Proof. Let

$n_0 = |\{1 \le j \le t \mid (\sigma(u_{2j}, u_{1j}), \sigma(u_{1j}, v)) = (0, 0)\}|$,
$n_1 = |\{1 \le j \le t \mid (\sigma(u_{2j}, u_{1j}), \sigma(u_{1j}, v)) = (0, 1)\}|$,
$n_2 = |\{1 \le j \le t \mid (\sigma(u_{2j}, u_{1j}), \sigma(u_{1j}, v)) = (1, 0)\}|$, and
$n_3 = |\{1 \le j \le t \mid (\sigma(u_{2j}, u_{1j}), \sigma(u_{1j}, v)) = (1, 1)\}|$.

Obviously, we have $t = n_0 + n_1 + n_2 + n_3$.

First, we consider the case that vertex $v$ is faulty. Suppose, by contradiction, that $n_0 \ge n_1$. Every branch counted by $n_0$ contains two faulty vertices, and every branch counted by $n_2$ or $n_3$ contains at least one. Then the total number of faulty vertices in $\mathcal{T}_G(v, t)$ amounts to at least $2n_0 + n_2 + n_3 + 1 \ge n_0 + n_1 + n_2 + n_3 + 1 = t + 1$. This contradicts the assumption that the total number of faulty vertices in $\mathcal{T}_G(v, t)$ does not exceed $t$. Hence, $n_0$ is strictly less than $n_1$, and the proposed algorithm outputs a correct diagnosis result.

Next, we consider the case that vertex $v$ is fault-free. Again, we assume, by contradiction, that $n_1 > n_0$, that is, $n_1 \ge n_0 + 1$. Every branch counted by $n_1$ now contains two faulty vertices. Then the total number of faulty vertices in $\mathcal{T}_G(v, t)$ amounts to at least $2n_1 + n_2 + n_3 \ge n_0 + n_1 + n_2 + n_3 + 1 = t + 1$, contradicting the assumption that the total number of faulty vertices in $\mathcal{T}_G(v, t)$ does not exceed $t$. Hence, $n_0$ needs to be greater than or equal to $n_1$, and the proposed algorithm correctly diagnoses the given vertex $v$. Therefore, the proof is completed. □

The extending star structure can be constructed in many multiprocessor systems and interconnection networks, such as hypercubes [28], crossed cubes [14], augmented cubes [11], star graphs [1], etc. Among various kinds of network topologies, the hypercube is one of the most popular networks for parallel and distributed computation. Not only is it ideally suited to both special-purpose and general-purpose tasks, but it can efficiently simulate many other networks [21]. Hence, we describe here how to construct an extending star of full order in the hypercube.

Fig. 2. The extending star $\mathcal{T}_G(v, k)$ consists of $2k + 1$ vertices and $2k$ edges.

Let $v = b_n \ldots b_i \ldots b_1$ be an n-bit binary string. For $1 \le i \le n$, we use $(v)^i$ to denote the binary string $b_n \ldots \bar{b}_i \ldots b_1$. Moreover, we use $[v]_i$ to denote the $i$th bit $b_i$ of $v$. The n-dimensional hypercube (or n-cube for short), denoted by $Q_n$, consists of $2^n$ vertices and $n2^{n-1}$ edges. Each vertex corresponds to an n-bit binary string. Two vertices, $u$ and $v$, are adjacent if and only if $v = (u)^i$ for some $i$. An n-cube can be constructed recursively. Let $Q_n^{(0)}$ and $Q_n^{(1)}$ denote the two subgraphs of $Q_n$ induced by the vertex subsets $\{v \in V(Q_n) \mid [v]_n = 0\}$ and $\{v \in V(Q_n) \mid [v]_n = 1\}$, respectively. For $n \ge 2$, $Q_n^{(0)}$ and $Q_n^{(1)}$ are isomorphic to $Q_{n-1}$. Then an extending star of full order rooted at any vertex $v$ in the n-cube can be formed by the graph $\mathcal{T}_{Q_n}(v, n)$, whose vertex set and edge set are

$\{v, (v)^1, ((v)^1)^2, (v)^n, ((v)^n)^1\} \cup \bigcup_{i=2}^{n-1} \{(v)^i, ((v)^i)^n\}$

and

$\{\{v, (v)^1\}, \{(v)^1, ((v)^1)^2\}, \{v, (v)^n\}, \{(v)^n, ((v)^n)^1\}\} \cup \bigcup_{i=2}^{n-1} \{\{v, (v)^i\}, \{(v)^i, ((v)^i)^n\}\}$,

respectively. See Fig. 3 for illustration.
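The hypercube construction above translates directly into bit operations. The sketch below is illustrative: vertices of the n-cube are encoded as integers in the range 0 to 2^n - 1, `flip(v, i)` plays the role of flipping the i-th bit of v (bits are 1-indexed as in the text), and both function names are hypothetical. It assumes n is at least 3, since for n = 2 the two second-level vertices of this particular construction coincide.

```python
def flip(v, i):
    """Return v with its i-th bit (1-indexed from the least significant bit) flipped."""
    return v ^ (1 << (i - 1))


def extending_star_hypercube(v, n):
    """Branches (u1, u2) of an extending star of full order rooted at v in Q_n.

    Each branch is the path v - u1 - u2. Assumes n >= 3 so that all
    second-level vertices are distinct.
    """
    branches = [
        (flip(v, 1), flip(flip(v, 1), 2)),  # v - (v)^1 - ((v)^1)^2
        (flip(v, n), flip(flip(v, n), 1)),  # v - (v)^n - ((v)^n)^1
    ]
    for i in range(2, n):                   # 2 <= i <= n - 1
        branches.append((flip(v, i), flip(flip(v, i), n)))  # v - (v)^i - ((v)^i)^n
    return branches


# Extending star rooted at vertex 0000 in Q_4.
star = extending_star_hypercube(0, 4)
```

The returned branch list has exactly n entries, so it can be handed to a DVRF-style majority vote together with the corresponding test outcomes.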

We now measure the time complexity of the proposed algorithm. For most practical systems $G$ with $N$ vertices, the degree of each vertex is in the order of $\log N$, and the extending star structure of full order can be constructed in time $O(\log N)$. For example, both the n-cube and the n-dimensional crossed cube have $N = 2^n$ vertices, and the degree of each vertex is $n = \log N$. Under the PMC model we assume that the time for a vertex to test another one is a constant $c$. Given an extending star $\mathcal{T}_G(v, n)$ rooted at a vertex $v$ in system $G$, the time needed for determining the fault status of vertex $v$ is $2c \log N = O(\log N)$. As a result, the total time for diagnosing the whole system $G$ is $O(N \log N)$.

4 CONDITIONAL-FAULT DIAGNOSIS

The underlying topologies of many multiprocessor systems are usually regular and even vertex-symmetric. By definition [1], a graph is vertex-symmetric if for every pair $u, v$ of vertices, there exists an automorphism of the graph that maps $u$ into $v$.

Consider a vertex-symmetric graph $G$ with one-step diagnosability $\tau(G) = t$; so $G$ is one-step t-diagnosable but not $(t + 1)$-diagnosable. However, the only case that stops it from being $(t + 1)$-diagnosable is usually that there exists a vertex $v$ whose neighbors are all faulty simultaneously. For example, members of the cube family behave in this way. A system is known to be strongly diagnosable if it is one-step t-diagnosable and can achieve $(t + 1)$-diagnosability, except for the case where a node's neighbors are all faulty. Recently, Hsieh and Chuang [17] studied the strong diagnosability of regular networks and product networks under the PMC model. We are, however, led to the following question: How large can the maximum value of $t$ be such that $G$ remains t-diagnosable under the additional condition that every fault-free vertex has at least one fault-free neighbor?

For a classical measurement of diagnosis capability, it is usually assumed that processor failures are statistically independent. It does not reflect the total number of processors in the system and the probabilities of processor failures. Najjar and Gaudiot [25] proposed the network resilience as the maximum number of failures that can be sustained while the network remains connected with a reasonably high probability. For a hypercube, the fault resilience is shown to be 25 percent for the four-dimensional hypercube $Q_4$, and it increases to 33 percent for the 10-dimensional hypercube $Q_{10}$. More particularly, the 10-dimensional hypercube $Q_{10}$ still remains connected with a probability higher than 0.99 even when 33 percent of its processors fail. They also drew a conclusion that large-scale systems with a constant degree are more susceptible to failures by disconnection than smaller networks. Intuitively, a connected network should have better diagnosis capability.

Let $G$ be a graph. A set $F \subseteq V(G)$ is called conditionally faulty if $N_G(v) \not\subseteq F$ for every vertex $v \in V(G) - F$. A graph is conditionally faulty if its faulty vertices form a conditionally faulty set. Furthermore, $G$ is said to be conditionally t-diagnosable if for any two conditionally faulty sets $F_1, F_2 \subseteq V(G)$ with $F_1 \ne F_2$ and $|F_1|, |F_2| \le t$, $(F_1, F_2)$ is a distinguishable pair. We propose the following concept.
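The definition of a conditionally faulty set can be checked mechanically. The sketch below assumes the graph is given as an adjacency dictionary mapping each vertex to the set of its neighbors; the function name `is_conditionally_faulty` and the example graph are illustrative.

```python
def is_conditionally_faulty(adj, faulty):
    """Return True if `faulty` is a conditionally faulty set of the graph `adj`.

    adj    : dict mapping each vertex to the set of its neighbors
    faulty : iterable of faulty vertices (the set F)

    F is conditionally faulty when N(v) is not a subset of F for every
    fault-free vertex v, i.e. every fault-free vertex keeps at least one
    fault-free neighbor.
    """
    faulty = set(faulty)
    return all(
        not adj[v].issubset(faulty)
        for v in adj
        if v not in faulty
    )


# 4-cycle 0-1-2-3-0.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}

surrounded = is_conditionally_faulty(cycle, {1, 3})  # both neighbors of 0 are faulty
single = is_conditionally_faulty(cycle, {1})
adjacent_pair = is_conditionally_faulty(cycle, {1, 2})
```

Under this reading, a fault-free vertex with no neighbors at all also violates the condition, because its empty neighborhood is trivially contained in F.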

Definition 3. Let $G$ be a graph and $v$ denote any vertex in $G$. Then $G$ is conditionally t-diagnosable locally at vertex $v$ if, given a syndrome $\sigma_F$ produced by any conditionally faulty set of vertices $F \subseteq V(G)$ with $v \in F$ and $|F| \le t$, the vertex $v$ must be an element of every conditionally faulty set of at most $t$ vertices that is consistent with $\sigma_F$.

The following theorem gives another way of characterizing whether a system is conditionally t-diagnosable locally at its vertex $v$.

Theorem 2. A graph $G$ is conditionally t-diagnosable locally at vertex $v \in V(G)$ if $F_1$ and $F_2$ form a distinguishable pair for any two conditionally faulty sets $F_1, F_2 \subseteq V(G)$ such that $F_1 \ne F_2$, $v \in F_1 \triangle F_2$, $|F_1| \le t$, and $|F_2| \le t$.

Proof. Let $S_1 \subseteq V(G)$ be any conditionally faulty set with $|S_1| \le t$ and $v \in S_1$. Furthermore, let $S_2 \subseteq V(G)$ denote any conditionally faulty set with $|S_2| \le t$ and $v \notin S_2$. Suppose that any two distinct conditionally faulty sets $F_1, F_2$ of $G$, with $v \in F_1 \triangle F_2$ and $|F_1|, |F_2| \le t$, form a distinguishable pair. Then we have $\sigma(S_1) \cap \sigma(S_2) = \emptyset$. Thus, $S_2$ is not consistent with any syndrome in $\sigma(S_1)$. It follows from contraposition that any conditionally faulty set $X \subseteq V(G)$, which has at most $t$ elements and can be consistent with a syndrome in $\sigma(S_1)$, must contain vertex $v$. By Definition 3, $G$ is conditionally t-diagnosable locally at vertex $v$. □

The edge-degree of an edge $\{u, v\}$ in a graph $G$, denoted by $\varepsilon_G(\{u, v\})$, is the number of distinct vertices of $V(G) - \{u, v\}$ adjacent to $u$ or $v$. For any vertex $v \in V(G)$, let $\xi_G(v) = \min\{\varepsilon_G(\{u, v\}) \mid \{u, v\} \in E(G)\}$ denote the minimum edge-degree of all the edges incident to vertex $v$.
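The edge-degree and the minimum edge-degree at a vertex are simple set computations. The following sketch assumes an adjacency dictionary mapping each vertex to the set of its neighbors; the function names and the choice of the 3-cube as an example are illustrative.

```python
def edge_degree(adj, u, v):
    """Number of vertices other than u and v that are adjacent to u or to v."""
    return len((adj[u] | adj[v]) - {u, v})


def min_edge_degree(adj, v):
    """Minimum edge-degree over all edges incident to v."""
    return min(edge_degree(adj, u, v) for u in adj[v])


# The 3-cube Q_3 with vertices 0..7; neighbors differ in exactly one bit.
q3 = {v: {v ^ 1, v ^ 2, v ^ 4} for v in range(8)}

ed_01 = edge_degree(q3, 0, 1)
min_ed_0 = min_edge_degree(q3, 0)
```

In the 3-cube, adjacent vertices have no common neighbor, so each edge has edge-degree 2 + 2 = 4.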

Fig. 3. An extending star of full order rooted at a vertex $v \in V(Q_n^{(0)})$.

Theorem 3. Let $G$ be a graph, $v \in V(G)$ denote a vertex, and $t$ be any positive integer less than or equal to $\xi_G(v) + 1$. Then $G$ is conditionally t-diagnosable locally at vertex $v$ if for every conditionally faulty set $F \subseteq V(G) - \{v\}$ with $0 \le |F| \le t - 1$, the connected component of $G - F$ that contains vertex $v$ either has at least $2(t - |F|) + 1$ vertices or consists of only two vertices that are adjacent to each other.

Proof. We prove the sufficiency by contradiction. Suppose that $G$ is not conditionally t-diagnosable locally at vertex $v$ although the sufficient condition holds. By Theorem 2, there exists an indistinguishable pair of conditionally faulty sets $(F_1, F_2)$, where $F_1 \ne F_2$, $|F_1| \le t$, $|F_2| \le t$, and $v \in F_1 \triangle F_2$. It follows from Lemma 2 that there is no edge between $V(G) - (F_1 \cup F_2)$ and $F_1 \triangle F_2$.

Let $F = F_1 \cap F_2$ and $p = |F|$. Because both $F_1$ and $F_2$ are conditionally faulty, $F$ is conditionally faulty too. Moreover, we have $0 \le p \le t - 1$ and $v \notin F$. Hence, $F_1 \triangle F_2$ is disconnected from other parts after removing all the vertices in $F$ from $G$. We observe that $|F_1 \triangle F_2| \le 2(t - p)$. Thus, the connected component $C_v$ of $G - F$, which contains vertex $v$, has at most $2(t - p)$ vertices. Note that $|V(C_v)| \ge 2$ since $N_G(v) \not\subseteq F$. Hence, we distinguish the following two cases.

Case 1: Suppose that $|V(C_v)| \ge 3$. Since $|V(C_v)| \le |F_1 \triangle F_2| \le 2(t - p)$, this contradicts the assumption that the component $C_v$ has at least $2(t - p) + 1$ vertices.

Case 2: Suppose that $|V(C_v)| = 2$. For convenience, let $V(C_v) = \{u, v\}$. Without loss of generality, we assume that $v \in F_1$. It is easy to see that $(N_G(v) \cup N_G(u)) - \{u, v\} \subseteq F$. Thus, we have $|F| \ge \varepsilon_G(\{u, v\})$. However, because $F_1$ is conditionally faulty, vertex $u$ is also in $F_1$. Hence, we have $|F_1| \ge |F \cup \{u, v\}| = |F| + 2 \ge \varepsilon_G(\{u, v\}) + 2$, contradicting the assumption that $|F_1| \le t \le \xi_G(v) + 1 \le \varepsilon_G(\{u, v\}) + 1$.

By contradiction, $G$ is indeed conditionally t-diagnosable locally at vertex $v$. □

We now propose a helpful structure, called the augmenting star, to identify whether a given vertex is fault-free in a conditionally faulty system.

Definition 4. Letting $G = (V, E)$ be a graph, $v \in V$ be any vertex, and $k$ be an integer greater than or equal to 2, an augmenting star of order $k$ rooted at vertex $v$ is defined to be the subgraph $\mathcal{A}_G(v, k) = (V(v, k), E(v, k))$ of $G$, where

$V(v, k) = \{v\} \cup \{u_i \mid 1 \le i \le k\} \cup \{x_{i,j}, y_{i,j}, z_{i,j} \mid 1 \le i \le k,\ 1 \le j \le k - 1\}$

and

$E(v, k) = \{\{v, u_i\} \mid 1 \le i \le k\} \cup \{\{u_i, x_{i,j}\} \mid 1 \le i \le k,\ 1 \le j \le k - 1\} \cup \{\{x_{i,j}, y_{i,j}\}, \{y_{i,j}, z_{i,j}\} \mid 1 \le i \le k,\ 1 \le j \le k - 1\}$.

An augmenting star of order $k$ is said to be of full order if $k = \deg_G(v)$. See Fig. 4 for illustration. For any $1 \le i \le k$, the subgraph of $\mathcal{A}_G(v, k)$ induced by the vertex set $\{u_i, x_{i,j}, y_{i,j}, z_{i,j} \mid 1 \le j \le k - 1\}$ is denoted by $\mathcal{A}^{(i)}_G(v, k)$.
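To make the shape of this structure concrete, the sketch below builds an abstract augmenting star of order k with labelled vertices and counts its vertices and edges. The labels (`"v"`, `("u", i)`, `("x", i, j)` and so on) and the function name `augmenting_star` are illustrative; the sketch does not search for such a subgraph inside a given graph G.

```python
def augmenting_star(k):
    """Return (vertices, edges) of an abstract augmenting star of order k >= 2.

    The root is labelled "v"; u_i is ("u", i); x_{i,j}, y_{i,j}, z_{i,j} are
    ("x", i, j), ("y", i, j), ("z", i, j). Edges are frozensets of two labels.
    """
    vertices = {"v"}
    edges = set()
    for i in range(1, k + 1):
        u = ("u", i)
        vertices.add(u)
        edges.add(frozenset({"v", u}))          # {v, u_i}
        for j in range(1, k):                   # 1 <= j <= k - 1
            x, y, z = ("x", i, j), ("y", i, j), ("z", i, j)
            vertices.update((x, y, z))
            edges.add(frozenset({u, x}))        # {u_i, x_{i,j}}
            edges.add(frozenset({x, y}))        # {x_{i,j}, y_{i,j}}
            edges.add(frozenset({y, z}))        # {y_{i,j}, z_{i,j}}
    return vertices, edges


vertices, edges = augmenting_star(3)
num_vertices = len(vertices)   # 1 + k + 3k(k - 1)
num_edges = len(edges)         # k + 3k(k - 1)
```

Counting directly from the definition, an augmenting star of order k has 1 + k + 3k(k - 1) vertices and k + 3k(k - 1) edges, so it is a tree with Θ(k²) vertices.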

Theorem 4. Let G be a graph and v 2 V ðGÞ denote a vertex. Suppose that the degree t of vertex v is at least 2; i.e., t  2. Then G is conditionally ð2t  1Þ-diagnosable locally at vertex v if it contains an augmenting star of full order rooted at vertex v as a subgraph.

Proof. Suppose that $G$ contains an augmenting star of full order rooted at $v$, $\mathbb{A}_G(v,t)$, as a subgraph. Then it is easy

to see that GðvÞ  2t  2. Therefore, we can apply

Theorem 3 to prove the result.

Let $F \subset V(G) - \{v\}$ be a conditionally faulty set of $p$ vertices for $0 \le p \le 2t-2$. Then we have to show that the connected component $C_v$ of $G - F$, which contains vertex $v$, either has at least $2(2t-1-p)+1 = 4t-2p-1$ vertices or consists of only two adjacent vertices. For convenience, let $r = |\{u_1, u_2, \ldots, u_t\} \cap F|$. Because $F$ is conditionally faulty and $v \notin F$, we have $0 \le r \le \min\{t-1, p\}$. Without loss of generality, we can assume that $\{u_{r+1}, \ldots, u_t\} \cap F = \emptyset$. Let $L_{i,j} = \{u_i, x_{i,j}, y_{i,j}, z_{i,j}\}$ for $1 \le i \le t$ and $1 \le j \le t-1$. Hence, there are at least $(t-r)(t-1) - (p-r)$ $L_{i,j}$'s with $L_{i,j} \cap F = \emptyset$. Thus, $C_v$ has at least $3[(t-r)(t-1) - (p-r)] + (t-r) + 1$ vertices. Comparing $3[(t-r)(t-1) - (p-r)] + (t-r) + 1$ with $4t-2p-1$, we consider the difference

$$\{3[(t-r)(t-1) - (p-r)] + (t-r) + 1\} - (4t-2p-1). \qquad (1)$$

First, we assume that $0 \le p \le 2t-3$ or $r \le t-2$. If $0 \le p \le 2t-3$, then the difference in (1) equals

$$3(t-r)(t-1) - 3t + 2r - p + 2 \ge 3(t-r)(t-1) - 3t + 2r - (2t-3) + 2 = (t-r-1)(3t-5) \ge 0;$$

if $r \le t-2$, then, since $p \le 2t-2$, the difference in (1) is at least $3(t-r)(t-1) - 5t + 2r + 4 = (t-r-2)(3t-2) + 3r \ge 0$. Thus, $C_v$ has at least $4t-2p-1$ vertices.
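As a sanity check of the algebra in this part of the proof, the sketch below evaluates the difference in (1) over a range of $(t, r, p)$ and confirms both factorizations. The function names and the tested range of $t$ are illustrative choices; the check is numeric and does not replace the proof.

```python
def surplus(t, r, p):
    """Difference in (1): lower bound on |V(C_v)| minus (4t - 2p - 1)."""
    lower_bound = 3 * ((t - r) * (t - 1) - (p - r)) + (t - r) + 1
    return lower_bound - (4 * t - 2 * p - 1)


def negative_cases(max_t=30):
    """Return every (t, r, p) in the admissible range with negative surplus."""
    bad = []
    for t in range(2, max_t + 1):
        for p in range(0, 2 * t - 1):               # 0 <= p <= 2t - 2
            for r in range(0, min(t - 1, p) + 1):   # 0 <= r <= min{t - 1, p}
                d = surplus(t, r, p)
                # Simplified form used in the proof.
                assert d == 3 * (t - r) * (t - 1) - 3 * t + 2 * r - p + 2
                if p <= 2 * t - 3:
                    assert d >= (t - r - 1) * (3 * t - 5) >= 0
                if r <= t - 2:
                    assert d >= (t - r - 2) * (3 * t - 2) + 3 * r >= 0
                if d < 0:
                    bad.append((t, r, p))
    return bad


if __name__ == "__main__":
    print(negative_cases(10))
```

In the tested range, the only triples with a negative difference are those with $p = 2t-2$ and $r = t-1$, where the difference equals $-1$; this is exactly the situation the proof handles separately.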

Second, we consider that $p = 2t-2$ and $r = t-1$. It is noticed that $u_1, u_2, \ldots, u_{t-1} \in F$ and $u_t \notin F$. Then we distinguish the following two cases.

Case 1: Suppose that $N_G(u_t) - \{v\} \subseteq F$. Obviously, we have $F = \bigcup_{k=1}^{t-1} \{u_k, x_{t,k}\}$. Hence, $C_v$ contains only two adjacent vertices $v$ and $u_t$.

Case 2: Suppose that $N_G(u_t) - \{v\} \not\subseteq F$. Accordingly, $C_v$ has at least three vertices, i.e., $|V(C_v)| \ge 3 = 4t - 2p - 1 = 4t - 2(2t-2) - 1$.

Hence, the theorem holds. □

Using the augmenting star structure we can design an efficient algorithm, namely Diagnose-The-Given-Vertex-In-Conditional-Faults (DVCF, for short), to diagnose any vertex in a conditionally faulty system.

Algorithm. DVCF($G$, $v$)

Input: Any vertex $v$ in a conditionally faulty graph $G$, in which there exists an augmenting star of full order rooted at $v$.

Output: The fault status of vertex $v$. As a convention, the algorithm output is 0 or 1 if vertex $v$ is fault-free or faulty, respectively.

BEGIN
1) $t \leftarrow \deg_G(v)$.
2) Construct an augmenting star of order $t$ rooted at $v$, $\mathbb{A}_G(v,t)$, as illustrated in Fig. 4.
3) $S \leftarrow \bigcup_{1 \le i \le t} \{u_i \mid \mathrm{DVRF}(\mathbb{A}^{(i)}_G(v,t), u_i) \text{ outputs } 0\}$
   $m_{i,0} \leftarrow |\{1 \le j \le t-1 \mid (\sigma(y_{i,j}, x_{i,j}), \sigma(x_{i,j}, u_i)) = (0,0)\}|$
   $m_{i,1} \leftarrow |\{1 \le j \le t-1 \mid (\sigma(y_{i,j}, x_{i,j}), \sigma(x_{i,j}, u_i)) = (0,1)\}|$
   $m_{i,2} \leftarrow |\{1 \le j \le t-1 \mid (\sigma(y_{i,j}, x_{i,j}), \sigma(x_{i,j}, u_i)) = (1,0)\}|$
   $m_{i,3} \leftarrow |\{1 \le j \le t-1 \mid (\sigma(y_{i,j}, x_{i,j}), \sigma(x_{i,j}, u_i)) = (1,1)\}|$
4) if $|S| \ge 3$
      then $n_0 \leftarrow |\{w \in S \mid \sigma(w, v) = 0\}|$
           $n_1 \leftarrow |\{w \in S \mid \sigma(w, v) = 1\}|$
           if $n_0 > n_1$ then return 0 else return 1
5) if $|S| = 2$
      then let $u_p, u_q \in S$
           if $m_{p,0} - m_{p,1} \ge m_{q,0} - m_{q,1}$
              then return $\sigma(u_p, v)$
              else return $\sigma(u_q, v)$
6) if $|S| = 1$
      then let $u_p \in S$ and return $\sigma(u_p, v)$
7) if $|S| = 0$
      then if $m_{i,1} - m_{i,0} \ge 2$ for every $1 \le i \le t$
              then return 1
              else let $p$ be an integer such that $m_{p,1} - m_{p,0}$ is equal to 1
                   $r \leftarrow |\{1 \le j \le t-1 \mid (\sigma(z_{p,j}, y_{p,j}), \sigma(y_{p,j}, x_{p,j}), \sigma(x_{p,j}, u_p)) = (1,0,1)\}|$
                   if $r \ge 1$
                      then return $\sigma(u_p, v)$
                      else return 1
END

Fig. 4. The augmenting star $\mathbb{A}_G(v,k)$ consists of $3k^2 - 2k + 1$ vertices and $3k^2 - 2k$ edges.
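The steps of DVCF can be sketched in Python on abstract vertex labels. This is a minimal sketch under stated assumptions, not the authors' implementation: the labels (`"v"`, `("u", i)`, `("x", i, j)`, ...) and the helper `pmc_syndrome` are hypothetical, and the DVRF subroutine, which is defined elsewhere in the paper, is replaced by the assumed rule "report $u_i$ fault-free iff $m_{i,0} \ge m_{i,1}$", a rule inferred from the remark in Case 4 of the proof of Theorem 5.

```python
def pmc_syndrome(t, faulty, lies=None):
    """Hypothetical helper: build a PMC syndrome on a full-order star.

    A fault-free tester reports the true status of the tested vertex; a
    faulty tester's report is arbitrary (looked up in `lies`, default 0).
    """
    lies = lies or {}
    tests = []
    for i in range(1, t + 1):
        tests.append((("u", i), "v"))
        for j in range(1, t):
            x, y, z = ("x", i, j), ("y", i, j), ("z", i, j)
            tests += [(x, ("u", i)), (y, x), (z, y)]
    sigma = {}
    for tester, tested in tests:
        if tester in faulty:
            sigma[(tester, tested)] = lies.get((tester, tested), 0)
        else:
            sigma[(tester, tested)] = int(tested in faulty)
    return sigma


def dvcf(t, sigma):
    """Return 0 (fault-free) or 1 (faulty) for the root "v"."""
    def pair(i, j):
        return (sigma[(("y", i, j), ("x", i, j))],
                sigma[(("x", i, j), ("u", i))])

    # m[i][c] counts the branches of u_i whose outcome pair encodes c:
    # (0,0) -> 0, (0,1) -> 1, (1,0) -> 2, (1,1) -> 3.
    m = {i: [0, 0, 0, 0] for i in range(1, t + 1)}
    for i in range(1, t + 1):
        for j in range(1, t):
            a, b = pair(i, j)
            m[i][2 * a + b] += 1

    # ASSUMPTION: stand-in for DVRF (not taken verbatim from the paper).
    S = [i for i in range(1, t + 1) if m[i][0] >= m[i][1]]

    def vote(i):
        return sigma[(("u", i), "v")]

    if len(S) >= 3:                       # step 4: majority vote
        n1 = sum(vote(i) for i in S)
        return 0 if len(S) - n1 > n1 else 1
    if len(S) == 2:                       # step 5
        p, q = S
        return vote(p) if m[p][0] - m[p][1] >= m[q][0] - m[q][1] else vote(q)
    if len(S) == 1:                       # step 6
        return vote(S[0])
    # step 7: |S| = 0
    candidates = [i for i in range(1, t + 1) if m[i][1] - m[i][0] == 1]
    if not candidates:
        return 1
    p = candidates[0]
    r = sum(1 for j in range(1, t)
            if (sigma[(("z", p, j), ("y", p, j))],) + pair(p, j) == (1, 0, 1))
    return vote(p) if r >= 1 else 1


if __name__ == "__main__":
    print(dvcf(4, pmc_syndrome(4, faulty=set())))   # healthy system
    print(dvcf(4, pmc_syndrome(4, faulty={"v"})))   # only the root is faulty
```

Keeping the syndrome as a dictionary keyed by (tester, tested) pairs means the sketch reads each test outcome at most once per step, in line with the per-vertex cost discussed in the complexity analysis.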

Theorem 5. Let $G$ be a conditionally faulty graph, $v \in V(G)$ denote any vertex, and $t = \deg_G(v)$. Suppose that there exists an augmenting star of full order rooted at vertex $v$, $\mathbb{A}_G(v,t)$. Then the proposed algorithm DVCF($G$, $v$) can identify the fault status of vertex $v$ correctly if $t \ge 4$ and the total number of faulty vertices in $\mathbb{A}_G(v,t)$ does not exceed $2t-1$.

Proof. Let $\{u_1, u_2, \ldots, u_t\}$ denote the set of neighbors of vertex $v$ and $S \subseteq \{u_1, u_2, \ldots, u_t\}$ be the set used in step (3) of the proposed algorithm. For convenience, we denote the set of all faulty vertices in $\mathbb{A}_G(v,t)$ by $F$. Let $A$ and $B$ denote two subsets of neighbors of vertex $v$ as follows:

$$A = \Bigl(\bigcup_{1 \le i \le t} \{u_i \mid \mathrm{DVRF}(\mathbb{A}^{(i)}_G(v,t), u_i) \text{ outputs } 0\}\Bigr) \cap F,$$

$$B = \Bigl(\bigcup_{1 \le i \le t} \{u_i \mid \mathrm{DVRF}(\mathbb{A}^{(i)}_G(v,t), u_i) \text{ outputs } 1\}\Bigr) - F.$$

First of all, we claim that $|A| + |B| \le 1$. By Theorem 1, the algorithm DVRF($\mathbb{A}^{(i)}_G(v,t)$, $u_i$) correctly identifies the faulty/fault-free status of vertex $u_i$, $1 \le i \le t$, in $\mathbb{A}^{(i)}_G(v,t)$ if the number of faulty vertices in $\mathbb{A}^{(i)}_G(v,t)$ does not exceed $t-1$. We assume, by contradiction, that $|A| + |B| \ge 2$. Then we have $|F| \ge t|A| + t|B| = (|A| + |B|)t \ge 2t$, contradicting the condition that $|F| \le 2t-1$. Hence, the claim holds.

We now consider the following four cases according to the number of vertices in $S$. For convenience, we use $f(H)$ to denote the number of faulty vertices in a graph $H$.

Case 1: Suppose that $|S| \ge 3$. Since $|A| + |B| \le 1$, at most one vertex in $S$ is likely to be faulty. Thus, more than half of the vertices in $S$ can correctly diagnose vertex $v$. Let $n_0 = |\{w \in S \mid \sigma(w,v) = 0\}|$ and $n_1 = |\{w \in S \mid \sigma(w,v) = 1\}|$. Then vertex $v$ is fault-free if and only if $n_0 > n_1$.

Case 2: Suppose that $|S| = 2$. Let $S = \{u_p, u_q\}$ with some $1 \le p, q \le t$. Then we claim that $u_p$ is fault-free if $m_{p,0} - m_{p,1} \ge m_{q,0} - m_{q,1}$. Suppose, by contradiction, that $u_p$ is faulty. Moreover, because at most one vertex in $S$ is likely to be faulty, vertex $u_q$ has to be fault-free. We further claim that $m_{q,1} = 0$ and $m_{q,2} + m_{q,3} \le 1$. We assume, by contradiction, that $m_{q,1} \ge 1$ or $m_{q,2} + m_{q,3} \ge 2$. Accordingly, the number of faulty vertices can be counted as follows:

$$\begin{aligned}
|F| &\ge |\{u_i \mid 1 \le i \le t,\ i \notin \{p,q\}\}| + f(\mathbb{A}^{(p)}_G(v,t)) + f(\mathbb{A}^{(q)}_G(v,t)) \\
&\ge (t-2) + t + (2m_{q,1} + m_{q,2} + m_{q,3}) \\
&\ge (t-2) + t + 2 = 2t,
\end{aligned}$$

which contradicts the assumption that $|F| \le 2t-1$. Hence, the claim of $m_{q,1} = 0$ and $m_{q,2} + m_{q,3} \le 1$ is true. Since $m_{q,0} + m_{q,1} + m_{q,2} + m_{q,3} = t-1$, we have $m_{p,0} - m_{p,1} \ge m_{q,0} - m_{q,1} = m_{q,0} \ge t-2 \ge 2$ for $t \ge 4$. As a result, the number of faulty vertices is estimated as follows:

$$\begin{aligned}
|F| &\ge |\{u_i \mid 1 \le i \le t,\ i \notin \{p,q\}\}| + f(\mathbb{A}^{(p)}_G(v,t)) \\
&\ge (t-2) + (1 + 2m_{p,0} + m_{p,2} + m_{p,3}) \\
&= (t-2) + (1 + m_{p,0} + m_{p,0} + m_{p,2} + m_{p,3}) \\
&\ge (t-2) + (1 + m_{p,1} + 2 + m_{p,0} + m_{p,2} + m_{p,3}) \\
&= (t-2) + (3 + t - 1) = 2t,
\end{aligned}$$

which contradicts the requirement of $|F| \le 2t-1$. Such a contradiction results from the original assumption that $u_p$ is faulty. In other words, vertex $u_p$ is really fault-free and able to make a correct diagnosis if $m_{p,0} - m_{p,1} \ge m_{q,0} - m_{q,1}$.

Case 3: Suppose that $|S| = 1$. Let $S = \{u_p\}$ with some $1 \le p \le t$. Then we claim that vertex $u_p$ is fault-free. Suppose, by contradiction, that $u_p$ is faulty; i.e., $u_p \in A$. However, Theorem 1 ensures that the algorithm DVRF($\mathbb{A}^{(p)}_G(v,t)$, $u_p$) correctly identifies the fault status of vertex $u_p$ in $\mathbb{A}^{(p)}_G(v,t)$ if the number of faulty vertices in $\mathbb{A}^{(p)}_G(v,t)$ does not exceed $t-1$. Therefore, the set of faulty vertices in $\mathbb{A}^{(p)}_G(v,t)$, denoted by $X$, has cardinality at least $t$. Furthermore, since $|A| + |B| \le 1$, the remaining $t-1$ neighbors of vertex $v$ (i.e., $u_i$, $1 \le i \ne p \le t$) are faulty too. In a conditionally faulty system, however, every fault-free vertex needs to have at least one fault-free neighbor. As a consequence, vertex $v$ is also faulty. In short, we have $\{v\} \cup N_G(v) \cup X \subseteq F$ so that $|F| \ge |\{v\} \cup N_G(v) \cup X| = |\{v\}| + |N_G(v) \cup X| \ge 1 + t + (t-1) = 2t$. Again, this contradicts the assumption that $|F| \le 2t-1$; that is, the claim holds.

Case 4: Suppose that $|S| = 0$. Obviously, we have $m_{i,1} - m_{i,0} \ge 1$ for every $1 \le i \le t$. In this case we first claim that for any $1 \le i \le t$, vertex $u_i$ is faulty if $m_{i,1} - m_{i,0} \ge 2$. Suppose, by contradiction, that $u_i$ is fault-free. Since $|A| + |B| \le 1$, $u_i$ is the only fault-free neighbor of $v$. Accordingly, the number of faulty vertices can be counted as follows:

$$\begin{aligned}
|F| &\ge |\{u_j \mid 1 \le j \ne i \le t\}| + f(\mathbb{A}^{(i)}_G(v,t)) \\
&\ge (t-1) + 2m_{i,1} + m_{i,2} + m_{i,3} \\
&\ge (t-1) + (2 + m_{i,0}) + m_{i,1} + m_{i,2} + m_{i,3} \\
&= (t-1) + 2 + (t-1) = 2t,
\end{aligned}$$

which contradicts the assumption that $|F| \le 2t-1$. So vertex $u_i$ is faulty if $m_{i,1} - m_{i,0} \ge 2$. With this claim, it is easy to see that vertex $v$ is faulty if $m_{i,1} - m_{i,0} \ge 2$ for every $1 \le i \le t$.

Suppose that there exists an integer $p$, $1 \le p \le t$, such that $m_{p,1} - m_{p,0} = 1$. Let $r$ be an integer defined as follows:

$$r = |\{1 \le j \le t-1 \mid (\sigma(z_{p,j}, y_{p,j}), \sigma(y_{p,j}, x_{p,j}), \sigma(x_{p,j}, u_p)) = (1,0,1)\}|.$$

We claim that vertex $u_p$ happens to be fault-free (respectively, faulty) if $r \ge 1$ (respectively, $r = 0$).

Subcase 4.1: Assume that $r \ge 1$. Suppose, by contradiction, that $u_p$ is faulty. If $u_i$ is faulty for every $1 \le i \le t$, then $v$ has to be faulty. Accordingly, the total number of faulty vertices can be counted as follows:

$$\begin{aligned}
|F| &\ge |\{u_i \mid 1 \le i \ne p \le t\} \cup \{v\}| + f(\mathbb{A}^{(p)}_G(v,t)) \\
&\ge t + (1 + 2m_{p,0} + m_{p,2} + m_{p,3} + 1) \\
&= t + m_{p,0} + m_{p,1} + m_{p,2} + m_{p,3} + 1 \\
&= t + (t-1) + 1 = 2t,
\end{aligned}$$

which contradicts the assumption that $|F| \le 2t-1$. On the other hand, if $v$ has a fault-free neighbor, say $u_q$ ($q \ne p$), then the number of faulty vertices is counted as follows:

$$\begin{aligned}
|F| &\ge |\{u_i \mid 1 \le i \ne q \le t\}| + f(\mathbb{A}^{(q)}_G(v,t)) + f(\mathbb{A}^{(p)}_G(v,t)) \\
&\ge (t-1) + t + (2m_{p,0} + m_{p,2} + m_{p,3} + 1) \\
&\ge 2t,
\end{aligned}$$

which contradicts the assumption that $|F| \le 2t-1$. Anyway, vertex $u_p$ is fault-free and can be an adequate tester if $r \ge 1$.

Subcase 4.2: Assume that $r = 0$. Suppose, by contradiction, that $u_p$ is fault-free. Then the total number of faulty vertices can be counted as follows:

$$\begin{aligned}
|F| &\ge |\{u_i \mid 1 \le i \ne p \le t\}| + f(\mathbb{A}^{(p)}_G(v,t)) \\
&\ge (t-1) + 3m_{p,1} + m_{p,2} + m_{p,3} \\
&= (t-1) + 1 + m_{p,0} + 2m_{p,1} + m_{p,2} + m_{p,3} \\
&= (t-1) + t + m_{p,1} \\
&\ge 2t,
\end{aligned}$$

which contradicts the assumption that $|F| \le 2t-1$. That is, vertex $u_p$ is faulty, and so is vertex $v$ if $r = 0$.

The proof is completed. □

We end by estimating the time complexity of the proposed algorithm. As described in Section 3, many interconnected systems with $N$ vertices have degree in the order of $\log N$ for each vertex. In these systems, an augmenting star structure can be constructed with time complexity $O((\log N)^2)$. However, for some unstructured networks, such as ad hoc networks, building such a structure may take more time. Given an augmenting star $\mathbb{A}_G(v,n)$ rooted at a vertex $v$ in a system $G$, the time taken in step (3) of the algorithm is $O((\log N)^2)$, because the time complexity of the algorithm DVRF is $O(\log N)$ for $\mathbb{A}^{(i)}_G(v,n)$, and it is run $O(\log N)$ times. As a result, the time complexity of the DVCF algorithm is $O((\log N)^2)$ when an augmenting star of full order is obtained in time $O((\log N)^2)$. Based on the symmetry of most practical multiprocessor systems, the time for system-level diagnosis is $O(N(\log N)^2)$.

5 EXAMPLES

In this section, we show that the proposed diagnosis algorithm can be applied to some well-known multiprocessor interconnected systems.

5.1 Construction of Augmenting Stars

As the first example, we show that the star graph [1] with dimension of five or more contains an augmenting star structure of full order rooted at each vertex as a subgraph. Let $n$ be a positive integer. The $n$-dimensional star graph, denoted by $\mathbb{S}_n$, is a graph whose vertex set consists of all permutations of $\{1, 2, \ldots, n\}$. Each vertex is uniquely assigned a permutation $x_1 x_2 \ldots x_n$ and is adjacent to $(n-1)$ vertices $x_i x_2 \ldots x_{i-1} x_1 x_{i+1} \ldots x_n$ for $2 \le i \le n$, which are vertices obtained by a transposition of the first digit with the $i$th one. Consequently, there are $n!$ vertices in an $n$-dimensional star graph, and each vertex has degree $n-1$.

For clarity, we use boldface letters to denote vertices of a star graph. For any vertex $v \in V(\mathbb{S}_n)$, its $i$-neighbor, denoted by $(v)_i$, is just the vertex obtained by a transposition of the first digit with the $i$th one of vertex $v$. For convenience of description, we say that vertices $v$ and $(v)_i$ are adjacent to each other with a (1i) edge.
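The adjacency rule of the star graph is easy to express in code. The following minimal sketch represents a vertex as a string of digits (an assumption that limits it to $n \le 9$) and uses the hypothetical helper names `i_neighbor` and `star_graph`.

```python
from itertools import permutations


def i_neighbor(v, i):
    """Return (v)_i: swap the first symbol of v with its i-th symbol (1-indexed)."""
    if not 2 <= i <= len(v):
        raise ValueError("i must satisfy 2 <= i <= n")
    s = list(v)
    s[0], s[i - 1] = s[i - 1], s[0]
    return "".join(s)


def star_graph(n):
    """Return the adjacency lists of the n-dimensional star graph."""
    vertices = ["".join(map(str, p)) for p in permutations(range(1, n + 1))]
    return {v: [i_neighbor(v, i) for i in range(2, n + 1)] for v in vertices}


if __name__ == "__main__":
    g = star_graph(4)
    print(len(g), g["1234"])
```

Because a transposition is its own inverse, $((v)_i)_i = v$, so the adjacency lists produced this way are symmetric.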

To construct an augmenting star of full order rooted at any vertex $v$ in $\mathbb{S}_n$, we need to use a topological property of star graphs. For any $1 \le i \le n$, let $V_i$ denote a subset of permutations of $\{1, 2, \ldots, n\}$, whose elements have symbol $i$ in the $n$th digit. Clearly, we have $V(\mathbb{S}_n) = \bigcup_{i=1}^{n} V_i$. Moreover, it is shown in [1] that the subgraph of $\mathbb{S}_n$ induced by $V_i$ is isomorphic to an $(n-1)$-dimensional star graph $\mathbb{S}_{n-1}$.

As usual, we use $\mathbb{A}_{\mathbb{S}_n}(v, n-1)$ to denote an augmenting star structure of full order rooted at any given vertex $v$ in $\mathbb{S}_n$. We depict $\mathbb{A}_{\mathbb{S}_5}(v, 4)$ in Fig. 5. In order to construct an augmenting star structure in $\mathbb{S}_n$ for $n \ge 5$, we propose the following algorithm (see Fig. 6 for illustration).

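Algorithm. Construct-Augmenting-Star-Of-Full-Order-In-Star-Graph($\mathbb{S}_n$, $x_1 x_2 \ldots x_n$)

Input: An $n$-dimensional star graph $\mathbb{S}_n$, $n \ge 5$, and any of its vertices $v = x_1 x_2 \ldots x_n$.

Output: An augmenting star of full order rooted at vertex $v = x_1 x_2 \ldots x_n$ in $\mathbb{S}_n$.

BEGIN
1) if $n < 5$
      then error “the dimensionality is illegal”
2) $(V, E) \leftarrow (\{v\}, \emptyset)$
3) if $n = 5$
      then for $i \leftarrow 2$ to $n$
              do $u \leftarrow (v)_i$
                 $(V, E) \leftarrow (V \cup \{u\}, E \cup \{\{v, u\}\})$
                 for $j \leftarrow 2$ to $n$
                    do if $j \ne i$
                          then $u \leftarrow (v)_i$
                               for $k \leftarrow 0$ to 2
                                  do if $j + k \ge 6$
                                        then $w \leftarrow (u)_{j+k-4}$
                                        else $w \leftarrow (u)_{j+k}$
                                     $(V, E) \leftarrow (V \cup \{w\}, E \cup \{\{u, w\}\})$
                                     $u \leftarrow w$
           return the graph $G \leftarrow (V, E)$ (see Fig. 5 for illustration)
4) $(V, E) \leftarrow$ Construct-Augmenting-Star-Of-Full-Order-In-Star-Graph($\mathbb{S}_n^{\{x_n\}}$, $x_1 x_2 \ldots x_n$)
5) for $i \leftarrow 2$ to $n-1$
      do $u \leftarrow (v)_i$
         $V \leftarrow V \cup \{(u)_n, ((u)_n)_2, (((u)_n)_2)_3\}$
         $E \leftarrow E \cup \{\{u, (u)_n\}, \{(u)_n, ((u)_n)_2\}, \{((u)_n)_2, (((u)_n)_2)_3\}\}$
6) $(V, E) \leftarrow (V \cup \{(v)_n\}, E \cup \{\{v, (v)_n\}\})$
7) for $i \leftarrow 2$ to $n-1$
      do $u \leftarrow (v)_n$
         for $k \leftarrow 0$ to 2
            do if $i + k \ge n$
                  then $w \leftarrow (u)_{i+k-n+2}$
                  else $w \leftarrow (u)_{i+k}$
               $(V, E) \leftarrow (V \cup \{w\}, E \cup \{\{u, w\}\})$
               $u \leftarrow w$
8) return the graph $G \leftarrow (V, E)$ (see Fig. 6 for illustration)
END

Fig. 5. An augmenting star $\mathbb{A}_{\mathbb{S}_5}(v, 4)$ rooted at any vertex $v$ in $\mathbb{S}_5$.

Fig. 6. An augmenting star in a star graph.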
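The construction above can be sketched in Python as follows. This is a sketch under stated assumptions rather than the authors' code: vertices are digit strings (so $n \le 9$), the helper names are hypothetical, the recursive call of step 4 is interpreted as running the same procedure on the first $n-1$ positions while the last symbol stays fixed, and each path of the base case is started afresh from $(v)_i$.

```python
def swap_first(p, i):
    """Return (p)_i: swap the first symbol of p with its i-th symbol (1-indexed)."""
    s = list(p)
    s[0], s[i - 1] = s[i - 1], s[0]
    return "".join(s)


def add_path(V, E, start, positions):
    """Walk from `start`, applying one transposition per entry of `positions`."""
    u = start
    for i in positions:
        w = swap_first(u, i)
        V.add(w)
        E.add(frozenset((u, w)))
        u = w


def augmenting_star(v, m=None):
    """Return (V, E) of a full-order augmenting star rooted at v.

    `m` is the number of leading positions in play; it defaults to len(v)
    and shrinks by one on each recursive call (assumed reading of step 4).
    """
    m = len(v) if m is None else m
    if m < 5:
        raise ValueError("the dimensionality is illegal")
    if m == 5:                                   # step 3: base case
        V, E = {v}, set()
        for i in range(2, 6):
            add_path(V, E, v, [i])               # edge {v, (v)_i}
            for j in range(2, 6):
                if j != i:
                    idx = [j + k - 4 if j + k >= 6 else j + k for k in range(3)]
                    add_path(V, E, swap_first(v, i), idx)
        return V, E
    V, E = augmenting_star(v, m - 1)             # step 4
    for i in range(2, m):                        # step 5: one more path per u_i
        add_path(V, E, swap_first(v, i), [m, 2, 3])
    add_path(V, E, v, [m])                       # step 6: new neighbor (v)_m
    for i in range(2, m):                        # step 7: paths hanging off (v)_m
        idx = [i + k - m + 2 if i + k >= m else i + k for k in range(3)]
        add_path(V, E, swap_first(v, m), idx)
    return V, E


if __name__ == "__main__":
    V, E = augmenting_star("12345")
    print(len(V), len(E))
```

For the root 12345 in $\mathbb{S}_5$, the path started at 21345 with $j = 5$ visits 51342, 15342, and 35142, so the sizes agree with the count $3k^2 - 2k + 1$ for $k = 4$.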

As another example, we show that the pancake graph with dimension of 5 or more also contains an augmenting star of full order rooted at any vertex. The $n$-dimensional pancake graph, denoted by $\mathbb{P}_n$, has the same vertex set as an $n$-dimensional star graph, i.e., all permutations of $\{1, 2, \ldots, n\}$. Its adjacency is defined as follows: vertex $x_1 x_2 \ldots x_i \ldots x_n$ is adjacent to vertex $y_1 y_2 \ldots y_i \ldots y_n$ through an $i$-dimensional edge, $2 \le i \le n$, if $y_j = x_{i-j+1}$ for all $1 \le j \le i$ and $y_j = x_j$ for all $i < j \le n$. Because the pancake graph is algebraically similar to the star graph, an augmenting star of full order can be constructed in the same way as that for the star graph, except for the case of $n = 5$. For this reason, we only depict an augmenting star of full order in the five-dimensional pancake graph. See Fig. 7.

5.2 Example of DVCF

Now we give an example of the DVCF algorithm. Suppose that

$$F_1 = \{12345, 32145, 42315, 52341, 41325, 51342, 15342\}$$

is a set of seven faulty vertices in $\mathbb{S}_5$, and we are required to identify the fault status of vertex 12345. The test assignment for $\mathbb{A}_{\mathbb{S}_5}(12345, 4)$ and its syndrome are illustrated in Fig. 8. In step (2) of the DVCF algorithm, the method presented in Section 5.1 can be applied to obtain an augmenting star of full order rooted at vertex 12345. Next, in step (3), $S = \{21345\}$ is computed. Since $|S| = 1$, the procedure goes to step (6) and returns the test outcome $\sigma(21345, 12345) = 1$ as its diagnosis output. That is, vertex 12345 is faulty.

In another case, we assume that

$$F_2 = \{32145, 42315, 52341, 31245, 41325, 51342, 15342\}$$

is a set of seven faulty vertices in $\mathbb{S}_5$. Again, we would like to identify the fault status of vertex 12345. The test assignment for $\mathbb{A}_{\mathbb{S}_5}(12345, 4)$ and its syndrome are illustrated in Fig. 9. Now, in step (3), $S = \emptyset$ is determined. Since $|S| = 0$, the procedure goes to step (7). Accordingly, we have $u_p = 21345$ and $r = 1$ so that the test outcome $\sigma(21345, 12345) = 0$ is returned. That is, vertex 12345 is fault-free.

5.3 Simulation

Our simulation is aimed at measuring the running time of the DVCF algorithm over star graphs and pancake graphs of different sizes. Because both graphs, $\mathbb{S}_n$ and $\mathbb{P}_n$, are vertex-symmetric and $(n-1)$-regular, we simulate the diagnosis process with respect to vertex $w_n = 12 \ldots n$. We carry out a round of simulation by randomly assigning a conditionally faulty set of $2(n-1) - 1 = 2n - 3$ vertices in the augmenting star structure rooted at $w_n$ 10,000 times and computing the average time for identifying the fault status of $w_n$. Such a round of simulation is then repeated 30 times to obtain the overall average. The hardware and software configuration includes:

1. Intel Core 2 Quad CPU Q8300 2.5 GHz,
2. 4 GB RAM,
3. 64-bit Windows 7 OS, and
4. C++ Programming Language in Microsoft Visual Studio 2005.

The experimental results are shown in Fig. 10.
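The measurement protocol described above (10,000 trials per round, 30 rounds, overall average) can be outlined as a small timing harness. This is a hedged sketch in Python rather than the authors' C++ program: `make_instance` and `diagnose` are placeholder callables that a user would supply, for example a generator of random conditionally faulty sets and a DVCF implementation.

```python
import time


def average_diagnosis_time(make_instance, diagnose, trials=10_000, rounds=30):
    """Average the per-call time of `diagnose` over `rounds` rounds of `trials` runs.

    `make_instance` builds one random input (hypothetical: a syndrome for a
    conditionally faulty set); only the call to `diagnose` is timed.
    """
    round_means = []
    for _ in range(rounds):
        elapsed = 0.0
        for _ in range(trials):
            instance = make_instance()
            start = time.perf_counter()
            diagnose(instance)
            elapsed += time.perf_counter() - start
        round_means.append(elapsed / trials)
    return sum(round_means) / rounds


if __name__ == "__main__":
    # Toy stand-ins so that the harness runs on its own.
    print(average_diagnosis_time(lambda: list(range(100)), sum, trials=100, rounds=3))
```

Timing only the `diagnose` call keeps the cost of generating fault sets out of the reported average, which seems to match the stated goal of measuring the diagnosis time itself.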

6 CONCLUDING REMARKS

The issue of identifying faulty processors is important for the design of multiprocessor interconnected systems, which are implementable with VLSI. The process of identifying all the faulty processors is the system-level diagnosis. In the random-fault probabilistic model of multiprocessor systems, processors are assumed to fail independently. Hence, the one-step diagnosability of a multiprocessor system is always upper bounded by its minimum degree. For many practical multiprocessor systems or interconnection networks, however, the probability that all the neighbors of a processor are faulty simultaneously is very small. In addition, the small diagnosability of a system is also owing to the fact that it only considers a global status of the entire system but ignores the unlikelihood of faulty processors occurring within a local substructure at the same time.

Fig. 7. An augmenting star in the five-dimensional pancake graph $\mathbb{P}_5$.

In this paper, we extend our previous research [18] and study the local diagnosis capability of a conditionally faulty system, in which every fault-free processor is required to have at least one fault-free neighbor. As shown in [18], estimating the local diagnosability with respect to each processor can also be thought of as a new strategy for checking the traditional one-step diagnosability of the whole system. Under the PMC model, we present a sufficient condition to estimate a given processor’s local diagnosis capability in a conditionally faulty system. Moreover, we propose an efficient fault identification algorithm, provided that there can be an augmenting star structure of full order rooted at each processor and the time for a processor to test another one is a constant.

ACKNOWLEDGMENTS

This work was supported in part by the National Science Council of the Republic of China under Contracts NSC 96-2221-E-009-134-MY3 and NSC 98-2218-E-468-001-MY3, and in part by the Aiming for the Top University and Elite Research Center Development Plan. The authors would like to express their deepest gratitude to the anonymous referees for their careful reading and constructive comments, which greatly improved the quality of the paper. Our gratitude also goes to Professor Timothy Williams, Asia University, for his kind help with language editing.

Fig. 8. The test assignment for $\mathbb{A}_{\mathbb{S}_5}(12345, 4)$ and its syndrome, in which $r_1, r_2, r_3, r_4, r_5, r_6 \in \{0, 1\}$.

Fig. 9. The test assignment for $\mathbb{A}_{\mathbb{S}_5}(12345, 4)$ and its syndrome, in which $r_3 = 1$, $r_4 = 0$, and $r_1, r_2, r_5, r_6, r_7 \in \{0, 1\}$.

REFERENCES

[1] S.B. Akers and B. Krishnamurthy, “A Group-Theoretic Model for Symmetric Interconnection Networks,” IEEE Trans. Computers, vol. 38, no. 4, pp. 555-566, Apr. 1989.

[2] J. Altmann, T. Bartha, and A. Pataricza, “An Event-Driven Approach to Multiprocessor Diagnosis,” Proc. Eighth Symp. MicroProcessor and Microcomputer Applications, vol. 1, pp. 109-118, 1994.

[3] F. Barsi, F. Grandoni, and P. Maestrini, “A Theory of Diagnosability of Digital Systems,” IEEE Trans. Computers, vol. 25, no. 6, pp. 585-593, June 1976.

[4] T.A. Bartic, J.Y. Mignolet, V. Nollet, T. Marescaux, D. Verkest, S. Vernalde, and R. Lauwereins, “Topology Adaptive Network-on-Chip Design and Implementation,” IEEE Proc. Computers and Digital Techniques, vol. 152, no. 4, pp. 467-472, July 2005.

[5] J.A. Bondy and U.S.R. Murty, Graph Theory. Springer, 2008.

[6] A. Caruso, S. Chessa, P. Maestrini, and P. Santi, “Evaluation of a Diagnosis Algorithm for Regular Structures,” IEEE Trans. Computers, vol. 51, no. 7, pp. 850-865, July 2002.

[7] A. Caruso, S. Chessa, P. Maestrini, and P. Santi, “Fault-Diagnosis of Grid Structures,” Theoretical Computer Science, vol. 290, no. 2, pp. 1149-1174, Jan. 2003.

[8] A. Caruso, L. Albini, and P. Maestrini, “A New Diagnosis Algorithm for Regular Interconnected Structures,” Lecture Notes in Computer Science, pp. 264-281, Springer-Verlag, 2003.

[9] A. Caruso, S. Chessa, and P. Maestrini, “Worst-Case Diagnosis Completeness in Regular Graphs under the PMC Model,” IEEE Trans. Computers, vol. 56, no. 7, pp. 917-924, July 2007.

[10] S. Chessa and P. Maestrini, “Correct and Almost Complete Diagnosis of Processor Grids,” IEEE Trans. Computers, vol. 50, no. 10, pp. 1095-1102, Oct. 2001.

[11] S.A. Choudum and V. Sunitha, “Augmented Cubes,” Networks, vol. 40, no. 2, pp. 71-84, Sept. 2002.

[12] A. Dahbura and G. Masson, “An OðN2:5Þ Fault Identification Algorithm for Diagnosable Systems,” IEEE Trans. Computers, vol. 33, no. 6, pp. 486-492, June 1984.

[13] A. Das, K. Thulasiraman, V.K. Agarwal, and K.B. Lakshmanan, “Multiprocessor Fault Diagnosis under Local Constraints,” IEEE Trans. Computers, vol. 42, no. 8, pp. 984-988, Aug. 1993.

[14] K. Efe, “A Variation on the Hypercube with Lower Diameter,” IEEE Trans. Computers, vol. 40, no. 11, pp. 1312-1316, Nov. 1991.

[15] A.D. Friedman and L. Simoncini, “System-Level Fault Diagnosis,” The Computer J., vol. 13, no. 3, pp. 47-53, Mar. 1980.

[16] S.L. Hakimi and A.T. Amin, “Characterization of Connection Assignment of Diagnosable Systems,” IEEE Trans. Computers, vol. C-23, no. 1, pp. 86-88, Jan. 1974.

[17] S.-Y. Hsieh and T.-Y. Chuang, “The Strong Diagnosability of Regular Networks and Product Networks under the PMC Model,” IEEE Trans. Parallel and Distributed Systems, vol. 20, no. 3, pp. 367-378, Mar. 2009.

[18] G.-H. Hsu and J.J.M. Tan, “A Local Diagnosability Measure for Multiprocessor Systems,” IEEE Trans. Parallel and Distributed Systems, vol. 18, no. 5, pp. 598-607, May 2007.

[19] A. Kavianpour and K.H. Kim, “Diagnosability of Hypercube under the Pessimistic One-Step Diagnosis Strategy,” IEEE Trans. Computers, vol. 40, no. 2, pp. 232-237, Feb. 1991.

[20] P.-L. Lai, J.J.M. Tan, C.-P. Chang, and L.-H. Hsu, “Conditional Diagnosability Measures for Large Multiprocessor Systems,” IEEE Trans. Computers, vol. 54, no. 2, pp. 165-175, Feb. 2005.

[21] F.T. Leighton, Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes. Morgan Kaufmann, 1992.

[22] M. Mánik and E. Gramatová, “Boolean Formalisation of the PMC Model for Faulty Units Diagnosis in Regular Multi-Processor Systems,” Proc. 11th IEEE Workshop Design and Diagnostics of Electronic Circuits and Systems (DDECS), pp. 1-2, Apr. 2008.

[23] M. Mánik and E. Gramatová, “Diagnosis of Faulty Units in Regular Graphs under the PMC Model,” Proc. 12th Int’l Symp. Design and Diagnostics of Electronic Circuits and Systems (DDECS), pp. 202-205, Apr. 2009.

[24] H. Masuyama and T. Miyoshi, “A Non-Adaptive Distributed System-Level Diagnosis Method for Computer Networks,” Proc. Int’l Conf. Comm. Systems and Networks, 2003.

[25] W. Najjar and J.L. Gaudiot, “Network Resilience: A Measure of Network Fault Tolerance,” IEEE Trans. Computers, vol. 39, no. 2, pp. 174-181, Feb. 1990.

[26] P.P. Pande, C. Grecu, M. Jones, A. Ivanov, and R. Saleh, “Performance Evaluation and Design Trade-Offs for Network-on-Chip Interconnect Architectures,” IEEE Trans. Computers, vol. 54, no. 8, pp. 1025-1040, Aug. 2005.

[27] F.P. Preparata, G. Metze, and R.T. Chien, “On the Connection Assignment Problem of Diagnosis Systems,” IEEE Trans. Electronic Computers, vol. 16, no. 6, pp. 848-854, Dec. 1967.

[28] Y. Saad and M.H. Schultz, “Topological Properties of Hypercubes,” IEEE Trans. Computers, vol. 37, no. 7, pp. 867-872, July 1988.

[29] A.K. Somani, V.K. Agarwal, and D. Avis, “A Generalized Theory for System Level Diagnosis,” IEEE Trans. Computers, vol. 36, no. 5, pp. 538-546, May 1987.

[30] A.K. Somani and V.K. Agarwal, “Distributed Diagnosis Algorithm for Regular Interconnected Systems,” IEEE Trans. Parallel and Distributed Systems, vol. 42, no. 7, pp. 899-906, July 1992.

[31] M. Xu, K. Thulasiraman, and X.-D. Xu, “Conditional Diagnosability of Matching Composition Networks under the PMC Model,” IEEE Trans. Circuits and Systems-II: Express Briefs, vol. 56, no. 11, pp. 875-879, Nov. 2009.

Cheng-Kuan Lin received the BS degree in applied mathematics from Chinese Culture University, Taiwan, Republic of China, in 2000, and the MS degree in mathematics from the National Central University in 2002. His research interests include interconnection networks, algorithms, and graph theory.

Tzu-Liang Kung received the BS degree in industrial administration from the National Taiwan University in 1997, the MS degree in statistics from the National Chiao Tung University, Taiwan, in 2001, and the PhD degree in computer science from the National Chiao Tung University in 2009. From 2001 to 2004, he served as a senior engineer at the Behavior Design Corporation, Taiwan. He is currently an assistant professor in the Department of Computer Science and Information Engineering, Asia University. His research interests include multivariate data analysis, machine translation, natural language processing, interconnected systems, fault-tolerant computing, algorithm design, and wireless networks.

Jimmy J.M. Tan received the BS and MS degrees in mathematics from the National Taiwan University in 1970 and 1973, respectively, and the PhD degree from Carleton University, Ottawa, in 1981. He has been on the faculty of the Department of Computer Science, National Chiao Tung University, since 1983. His research interests include design and analysis of algorithms, combinatorial optimization, and interconnection networks.
