Accurate Rank Ordering of Error Candidates for Efficient HDL Design Debugging

(1)

Accurate Rank Ordering of Error Candidates

for Efficient HDL Design Debugging

Tai-Ying Jiang, Chien-Nan Jimmy Liu, and Jing-Yang Jou, Fellow, IEEE

Abstract—When hardware description languages (HDLs) are

used in describing the behavior of a digital circuit, design errors (or bugs) almost inevitably appear in the HDL code of the circuit. Existing approaches attempt to reduce efforts involved in this debugging process by extracting a reduced set of error candidates. However, the derived set can still contain many error candidates, and finding true design errors among the candidates in the set may still consume much valuable time. A debugging priority method [21] was proposed to speed up the error-searching process in the derived error candidate set. The idea is to display error candidates in an order that corresponds to an individual’s degree of suspicion. With this method, error candidates are placed in a rank order based on their probability of being an error. The more likely an error candidate is a design error (or a bug), the higher the rank order that it has. With the displayed rank order, circuit designers should find design errors quicker than with blind searching when searching for design errors among all the derived candidates. However, the currently used confidence score (CS) for deriving the

debugging priority has some flaws in estimating the likelihood of

correctness of error candidates due to the masking error situation. This reduces the degree of accuracy in establishing a debugging

priority. Therefore, the objective of this work is to develop a new

probabilistic confidence score (PCS) that takes the masking error situation into consideration in order to provide a more reliable and accurate debugging priority. The experimental results show that our proposed PCS achieves better results in estimating the likelihood of correctness and can indeed suggest a debugging

priority with better accuracy, as compared to the CS.

Index Terms—Error diagnosis, hardware description language

(HDL), HDL code debugging, verification.

I. INTRODUCTION

W

ITH THE increasing complexity of very large scale integration circuit designs, the design cycle of a digital circuit often involves several design stages. Verification is used to monitor the consistency in designs between subsequent stages. When the verification process finds that a design in the current stage (implementation) is not consistent with that in the previous stage (specification), design error diagnosis is needed.

Manuscript received March 18, 2008; revised September 26, 2008. Current version published January 21, 2009. This paper was recommended by Associate Editor V. Narayanan.

T.-Y. Jiang is with the Department of Electronics Engineering, National Chiao Tung University, Hsinchu 300, Taiwan (e-mail: taiyingjiang@realtek. com.tw).

C.-N. J. Liu is with the Department of Electrical Engineering, National Central University, Jhongli City 320, Taiwan.

J.-Y. Jou is with the Department of Electronics Engineering, National Chiao Tung University, Hsinchu 300, Taiwan and is Vice Chancellor (Academic Affairs) of the University System of Taiwan.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCAD.2008.2009163

Most of the previous studies on the topic of design error diagnosis target the diagnosis of gate-level or lower level implementations. These methods can be roughly divided into two categories: 1) symbolic approaches and 2) simulation-based approaches. Symbolic approaches [1]–[6] use a binary decision diagram (BDD) to represent functional manipulation with BDD to formulate the necessary and sufficient condition of fixing design errors. Some recent symbolic works exploit the progress of a Boolean satisfiability (SAT) solver and develops SAT-based approaches [6]. On the other hand, simulation-based approaches [7]–[14] rely on simulating erroneous test vectors to gradually reduce impossible error candidates. Some of these methods are error-model-based methods, some are structure-based methods, and some are a combination of the two. Such error models include gate errors (missing gate, extra gate, wrong logical connective, etc.), line errors (missing line, extra line, etc.), and unknown models (Xs).

In addition to the gate or other lower levels, design errors can also occur at the very first design stage, i.e., modeling the circuit behavior using hardware description languages (HDLs). Tra-ditionally, debugging a faulty HDL design relies on manually tracing the faulty HDL code. However, a simple HDL design today can probably have thousands of code lines. Manually tracing the faulty HDL code to debug is not an effective debugging method.

In the literature, some researches have targeted techniques that assist HDL design debugging. Peischl and Wotawa [20] focused on the model-based diagnosis paradigm and employed structures and behaviors for source-code-level debugging with respect to their error models. There are also error-model-free methods, which should have better applicability to various kinds of design errors. Khalil et al. [15], [16] proposed an approach that can automatically derive four sets of error can-didates in a sequence, i.e., from the smallest set to the biggest set. In doing so, they hope that designers can find bugs in the smaller error candidate sets to reduce the required debugging efforts. However, this is not always possible, and the efforts in reviewing the first three sets may be wasted. Shi and Jou [17] applied data dependence analysis, execution statistics, and the characteristics of HDL operations to filter out impossible error candidates. In this method, only one error space is derived for examination, and it is acceptable in size. Huang et al. [18] fur-ther exploited the extra observability of assertions in an attempt to derive a smaller error space. Instead of automatic methods, Hsu et al. developed two useful utilities to help designers explain the locality of bugs with manual interventions [19].

(2)

(called “error space” in this paper) can still contain many error candidates, and identifying true design errors by examining candidates one by one still requires much effort and time. An interesting technique called debugging priority has been pro-posed for accelerating error searching in the derived candidate set [21]. A measurement called confidence score (CS) has been developed to assess the likelihood of correctness of each error candidate. Then, by sorting error candidates according to the CS, error candidates are displayed in a prioritized order: from the most likely to the least likely one. With this ranked order, true design errors would be placed in the first few lines and should be found by designers if they search errors according to the order. However, here, it is implicitly assumed that the

masking error situation, in which the erroneous effects caused

by design errors cannot be observed at the outputs, seldom occurs and is not considered within the CS [21]. Without con-sidering the masking error situation, the CS may overestimate the likelihood of correctness of the true design error, as the latter may not be placed in the first few lines, as expected. Unfortunately, the benefits of debugging priority are limited.

Therefore, this paper attempts to develop a new probabilistic measurement called probabilistic confidence score (PCS) that takes the masking error situation into consideration to more suitably estimate the likelihood of correctness of an error candidate. Instead of proposing a new approach for deriving a reduced set of error candidates, this paper focuses on obtaining a more accurate debugging priority to speed up the error-searching process among the candidates in the derived error space. Designers can apply any approaches to derive an error space and then use our method to obtain a reliable debugging

priority. The inputs of our algorithm are given as follows:

1) an error space; 2) a faulty HDL design; and 3) a value change dump file obtained during the simulation. The output will be a candidate list, with the debugging priority sorted according to our proposed PCS.

The remainder of this paper is organized as follows: The motivation of this work is introduced in Section II. Section III describes the modeling of the likelihood that a masking error

situation occurs and defines the PCS. An effective PCS

calcu-lation algorithm is introduced in Section IV. The experimental results are presented in Section V. Finally, we conclude this paper in Section VI.

II. MOTIVATION

When estimating the likelihood of correctness of an error candidate to obtain the debugging priority, Jiang et al. assumed that the erroneous effects of activated errors are seldom masked and can often be propagated to the primary outputs (POs) [21]. With this assumption, if there were no incorrect simulation values at the POs (PO1, PO2, PO3, . . . , POm) at time instance t = ti, the sensitized statements (SSs) of the POs tended to be correct.1 _{As a result, the SSs of the POs receive CS points}

according to the formula of the CS. In short, the CS did not model the masking error situation to estimate the likelihood

1_{Otherwise, erroneous values caused by the SSs should make the simulation}

values of the POs inconsistent with the expected values at t = ti.

Fig. 1. HDL example. (a) HDL code. (b) Input stimulus. (c) Simulation result.

Fig. 2. Erroneous simulation results and debugging priority. (a) Erroneous simulation result. (b) Debugging priority and the CS.

of correctness of the error candidates. However, many HDL operations can actually mask the erroneous effects and prevent them from appearing at the POs. If the erroneous effects were masked, preventing them from being observed at the POs, the SSs may get CS points, even if some of them have design errors hiding within. An example follows to illustrate this point.

Suppose that the HDL code that a designer intends to write is the Verilog HDL code in Fig. 1(a). This HDL code has only one PO, i.e., PO1, on which the simulation results are compared against the expected values. The clock period of the clock signal

clk is assumed to be 10 ns. If the HDL code is simulated with the

input stimulus shown in Fig. 1(b), the simulation result obtained is shown in Fig. 1(c). We consider the result in Fig. 1(c) as the golden result, because the code in Fig. 1(a) is what the designer intends to write.

However, if the statement in line 7 (which is denoted as S7) “counter = counter + 1” is carelessly written as “counter =

counter + 2,” the simulation result will become that shown in

Fig. 2(a). It can be seen that the simulation value of PO1 at

t = 25 in Fig. 2(a) is different from its expected value shown

in Fig. 1(c). According to the definition in [21], PO1 is an erroneous PO, and the clock cycle ranging from T ime = 15 to

T ime = 25 is the error-occurring clock cycle (EOC). By using

the error space identification approach in [21], an error space, i.e..{S1, S2, S3, S4, S5, S6, S7}, can be obtained.

After obtaining the error space, the CS for each error candidate will be calculated. Each SS of a PO with correct simulation at a time instance before the EOC gets one CS point. Finding SSs requires backward tracing from the POs in the reverse direction of the data flow until the primary inputs (PIs), registers, or constants are reached. When reaching a conditional vertex, such as S2 and S5 in Fig. 1(a), the authors propose

(3)

to traverse the taken branch(es) and the control signal, and ignore the untaken branch(es). For example, at T ime = 1, since the evaluation result of “if(reset)” at T ime = 1 is TRUE, the traversal reaches S5 and then backward traverses the TRUE branch and the control signal (i.e., reset). The obtained SSs for PO1 at T ime = 1 are{S5, S6}, and both receive one CS point. As can be seen, all the traversals must commence with one PO. Each PO traversal is completed in a particular simulation instance. This process is repeated until all the PO traversals in a particular simulation instance have been completed. Finally, once all the PO traversals in all the simulation instances have been completed, the debugging priority shown in Fig. 2(b) is obtained. The numbers within parentheses are the CSs of the corresponding error candidates.

It can be seen that the design error in statement S7 “counter = counter + 2” is not placed in the first line but the fifth line in Fig. 2(b). If circuit designers examine error candidates according to this debugging priority, four trials would be wasted before the true error S7 can be found. Design error S7 is placed in the fifth line, because S7 receives two CS points; this is because the erroneous values caused by S7 are masked twice on its way propagating to PO1.

The first masking error situation occurs at T ime = 5. It can be seen that the erroneous statement S7 causes an incorrect value (i.e., 3, which is different from the correct value shown in Fig. 1(c), i.e., 2, as highlighted using an underline) to be displayed on the signal counter at T ime = 5. However, this in-correct value 3 is masked by the operation “counter < PI2” in S2, because both the correct (i.e., 2) and incorrect values (i.e., 3) yield the same result at the output of the operation “counter < PI2,” i.e., they are both smaller than the value of PO2 (i.e., 4). A similar masking error situation occurs at T ime = 15 although the incorrect value of counter is propagated through the output of “counter < PI2,” causing it to be FALSE. However, the incorrect result (i.e., FALSE) does not alter the value of PO1 (i.e., 4). This means that signal a is 4 at T ime = 15. It is masked by the conditional operation “if (. . .) . . . else . . .” and cannot cause incorrect values at PO1 at T ime = 15. Because the CS does not consider the possible masking error situation that may be caused by the operation “counter < PI2” and the conditional operation “if (. . .) . . . else . . .,” S7 is given a CS score of two points. This puts S7 in line 5 in the candidate list in Fig. 2(b). The accuracy of the debugging priority is reduced due to the lack of consideration for the masking error situation of the CS.

Observing the masking error situation, this work aims at estimating the likelihood of error masking (LOEM) for an SS to assess the score it can receive. If the LOEM of an arbitrary SS, i.e., SSi, is quite low, it is almost impossible for the masking error situation to occur on SSi’s way to the POs. It should be comparatively safe to consider SSi as a correct statement and give SSia high score. On the contrary, if the LOEM of SSiis high, it should be given a low score.

III. PCS

In the following introduction, the input-faulty HDL design is modeled as a modified control/data flow graph (CDFG) G =

Fig. 3. CDFG of the HDL code in Fig. 1.

(V , E), where V is the set of vertices and E is the set of edges connecting the vertices. Let v be a vertex in V . Each vertex

v corresponds to an operation in the HDL code. Function fv and variable yvare also associated with vertex v. Function fvis the function of the operation to which v corresponds. Variable

yv is the output variable of fv or the left-hand variable of the operation. The Verilog HDL code fragment in Fig. 1(a) is used as an example, and its CDFG is constructed, as shown in Fig. 3. Vertex “1 :∗” corresponds to operation “a = PI1 ∗ 4” in the statement in line 1 (S1). Function f1:∗ is multiplication

“(∗),” and y1:∗ is signal a. Vertex “2 : if (. . .) . . . else . . ..”

corresponds to the operation “if (. . .) . . . else . . .” in lines 2–4. The functionality of vertex “2 : if (. . .) . . . else . . .” is quite similar to that of a multiplexer. Vertex PO1 is a special vertex representing the only PO of the circuit, i.e., PO1.

Edge (v, u)∈ E indicates that the input of vertex u is data dependent on the output of v. As shown in Fig. 3, an edge (1 :∗, 4 :=) exists since the operation “4 :=” takes the output of vertex “1 :∗” as its input. The fan-out of v is a set of vertices u, such that there is an edge from v to u. The fan-in of v is a set of vertices k, such that there is an edge from

k to v. A path P from vertex u to vertex u is a sequence of verticesv0, v1, v2, . . . , vk, such that u = v0, u= vk, and (v_i−1, vi)∈ E.

Suppose that the verification finds an incorrect circuit be-havior at the nth positive edge of the clock signal t = cn.2 This special positive edge of the clock is called the error-occurring edge (EOE). Assume that the faulty design under verification (DUV) has m POs{PO1, PO2, . . . , POm} and that n− 1 clock cycles pass before the EOE (t = cn). To show how we model the masking error situation and estimate LOEM, we first consider that a design error hides within an arbitrary statement v. If the erroneous statement v caused an incorrect value w on its left-hand variable yv at time instance t = ti, this incorrect value w would not cause any incorrect behaviors at any POs at all the rising edges of the clock before t = cn. Otherwise, the EOE is not t = cn but another earlier rising edge of the clock. More specifically, for an arbitrary POj at an arbitrary rising edge of clock t = ck before the EOE, the incorrect value w is masked by some vertices on the paths from statement v at t = ti (which is denoted as v@t = ti) to POj at t = ck (which is denoted as POj@t = ck), causing the

2_{We assume that the simulation values of all the POs are compared with}

the correct values only on the rising edges of the clock signal. If the DUV is a falling-edge-triggered or double-edge-triggered design, the modeling and computation algorithm can easily be changed to fit to it.

(4)

simulation value of POj to be the same as the correct value at t = ck, i.e.,

fv@t=ti→POj@t=ck(w) = CV (POj@t = ck) (1)

where fv@t=ti→POj@t=ck is the function of the paths from

v in time frame t = ti to POj in time frame t = ck, and CV (POj@t = ck) is the correct value of POjat t = ck.

For all the other POs of the DUV, the incorrect value w would also be masked on the way to the POs at all the rising edges before the EOE, so that it could remain uncovered before EOE. That is, for each PO POj at each rising edge of clock t = ck before EOE, the function of the path(s) from vertex v at t = ti to POj at t = ck must generate the correct value of POj at t = ck with w, even if w is an incorrect value. The preceding description can be modeled as

m j=1 n−1 k=0 fv@t=ti→POj@t=ck(w) = CV (POj@t = ck). (2)

We now consider the likelihood that the incorrect value w truly exists on yv but is masked from causing any incorrect values on the POs at any time instances before the EOE. We first notice that all the possible values of yv that can satisfy (2) form a special set of values. We call this the masked value set (MVS) of vertex v at time instance t = ti, which is denoted as MVS(v@t = ti) and is given by MVS(v@t = ti) = x m j=1 n−1 k=0 fv@t=ti→POj@t=ck(x) = CV (POj@t = ck) . (3)

Each element in MVS(v@t = ti) retains the correct values of all the POs at all the rising edges of the clock before the EOE, regardless if it is a correct value or not. The correct value of the output of vertex v at t = ti is, of course, contained in MVS(v@t = ti). This justifies the existence of MVS(v@t = ti). If MVS(v@t = ti) contains only one element, obviously, it will be the correct value of yv at t = ti. In this case, no incorrect values ever exist in MVS(v@t = ti), and the masking error situation can never occur. Statement v at t = tiis given a high score. On the other hand, if the set contains many elements, an incorrect value is very likely to exist in the set and become an incorrect value that remains unrevealed at all the rising edges of the clock before the EOE. The correctness of statement v is less obvious. In other words, the more elements MVS(v@t = ti) contains, the more likely the simulated value of v at t = ti is a masked incorrect value. Hence, we define the LOEM of statement v at time instance t = tias follows:

LOEM(v@t = ti) = |MVS(v@t = ti )| − 1

2n− 1 . (4)

Its complement is the likelihood that an erroneous value of

v at t = ti is propagated to at least one PO before the EOE

and observed, i.e., the likelihood of error propagating (LOEP) of v at t = ti, which is given by

LOEP(v@t = ti) = 1−|MVS(v@t = ti )| − 1

2n− 1 . (5)

In the given input value change dump file, the output variable

yv of an arbitrary statement v can change its value many times, e.g., l times, at different time instances before the EOE

{t = t1, t = t2, . . . , t = tl}. Each time the value of yvchanges at time instance t = ti, there will be one particular value of LOEP(v@t = ti). The PCS of v, i.e., PCS(v), is defined as the maximum among these LOEP values, as described in

PCS(v) = max{LOEP(v@t = ti)} ,

where ti∈ {t = t1, t = t2, . . . , t = tl}. (6) A low LOEP (high LOEM) means that any erroneous effects caused by v at t = ti are very possible to be masked. The correctness of v at t = ti may become doubtful, even if the simulation values of all the POs are correct before the EOE. It is reasonable to give v less PCS due to its small LOEP value. On the other hand, if the LOEP value is high, it is equally reasonable to give it more PCS. Therefore, we define PCS as (6). It can be seen that PCS computation now turns into the problem of efficiently computing the MVSs of each error candidate at different time instances before an EOE.

IV. PCS COMPUTATIONALGORITHM

The proposed PCS computation algorithm is a topology-based analysis with time frame expansion for handling the sequential behavior of the DUV. While calculating the LOEP of the output variable of vertex v in time frame t = ti, the algorithm will consider each sensitized path from v in time frame t = ti to any PO in each time frame before the EOE. This path-oriented computation scheme is defined as follows: MVS(v@t = ti) = m j=1 n−1 k=0 x|fv@t=ti→POj@t=ck(x) = CV (POj@t = ck)} (7) which can be derived from (3).

The set {x|fv@t=ti→POj@t=ck(x) = CV (POj@t = ck)} is

defined as the MVS of vertex v at time instance t = ti with respect to POj at t = ck (which is denoted as MVS(v@t = ti)POj@t=ck). An element of the set other than the correct value

can be regarded as an incorrect value that is masked by some vertices on the path(s) from v at t = tito POj at t = ck, thus keeping the correct value of POjat t = ck.

According to (7), if it is possible to derive MVS(v@t =

ti)POj@t=ck for each POj at each time frame t = ck, then in-tersecting these sets yields MVS(v@t = ti). If there is exactly one path from v at t = tito a PO POjat t = ck, an induction-based computation approach is proposed to compute an exact MVS(v@t = ti)POj@t=ck, which is introduced in Sections IV-A

and IV-B. If there are multiple paths from v at t = ti to POj at t = ck, i.e., reconvergent paths, a quick estimation approach that guarantees lower bound LOEP estimations will be applied,

(5)

Fig. 4. Path from b@t = tito POj@t = ck.

Fig. 5. Pseudocode of the MVS computation for a single path.

which is introduced in Section IV-C. Two time-saving strate-gies are introduced in Section IV-D. The entire algorithm is presented in Section IV-E and incorporates each part that was introduced before.

A. MVS Computation for a Single Path

Assume that there is a single path P from a vertex b at time instance t = ti to a PO POj at a rising edge of clock t = ckb@t = ti, an, an−1, . . . , a2, a1, POj@t = ck, as shown in Fig. 4. An algorithm written as a pseudocode, as shown in Fig. 5, is used to compute MVS(b@t = ti)POj@t=ck.

For each PO at each rising clock edge, the algorithm in Fig. 5 will recursively call subroutine MVS_for_Vertex to perform an MVS computation and use a depth-first search strategy for backward traversals (from POj at t = ck to b@t = ti). The input of the subroutine comprises a previously computed set of integers (PreviousMVS), the currently traversed vertex v, and the current time frame ti. If the currently traversed vertex v is a normal vertex, all the fan-in vertices of vertex v will be traversed (line 7). However, if vertex v is a control vertex, the

fan-in vertices on the untaken branch(es) will be marked as

“inactive” and will not be traversed (line 5).

The key step of this algorithm (line 12) is the computation of a set of all u’s output values (CurrentMVS) that can make the function of v, i.e., fvgenerate an output value that is in the set PreviousMVS. Then, the newly computed set CurrentMVS will

become the input PreviousMVS of subroutine MVS_for_Vertex and will be recorded on vertex u, along with the time in-formation after the subroutine is called again. Section IV-B

introduces the procedures for computing CurrentMVS based on

PreviousMVS (line 12). The following is an explanation of how

this algorithm can derive MVS(b@t = ti)POj@t=ck, in the case

of a single path from b at t = tito POjat t = ck.

Theorem 1: As shown in Fig. 4, function fnis the composite function of the vertices from a1to an and comprises fan and

fn−1. For an arbitrary value x on the output of vertex b at t = ti, x is in MVS(b@t = ti)POj@t=ck if and only if fan(x) is in

MVS(c@t = ti)POj@t=ck, which can be represented as

MVS(b@t = ti)POj@t=ck =x|fan(x)∈ MVS(c@t = ti)POj@t=ck . (8) Proof: Claim 1 MVS(b@t = ti)POj@t=ck ⊇x|fan(x)∈ MVS(c@t = ti)POj@t=ck .

For each value x contained in {x | fan(x)∈ MVS(c@t =

ti)POj@t=ck}, x must satisfy fn−1(fan(x)) = CV (POj@t =

ck) and thus also satisfy fn(x) = CV (POj@t = ck). That is, x is contained in MVS(b@t = ti)POj@t=ck. This proves Claim 1.

Claim 2

MVS(b@t = ti)POj@t=ck

⊆x|fan(x)∈ MVS(c@t = ti)POj@t=ck

.

By way of contradiction, first, assume that there is a value

x that is in MVS(b@t = ti)POj@t=ck, but fan(x) is not

in MVS(c@t = ti)POj@t=ck. Since x is in MVS(b@t =

ti)POj@t=ck, then fn(x) = CV (POj@t = ck), which im-plies that f_n−1(fan(x)) = CV (POj@t = ck). This means that

fan(x) is contained in MVS(c@t = ti)POj@t=ck. This is a

contradiction.

From Claims 1 and 2, it is proven that MVS(b@t = ti)POj@t=ck

=x|fan(x)∈ MVS(c@t = ti)POj@t=ck

.

When subroutine MVS_for_Vertex is called for the first time, the computed CurrentCVS{x|fa1(x)∈ {CV (POj@t = ck)}} is actually MVS(g@t = ck)POj@t=ck according to the

definition. When the subroutine is called for the second time, the computed CurrentMVS {x|fa2(x)∈ MVS(g@t = ck)POj@t=ck} should be MVS(e@t = ck)POj@t=ck

accord-ing to Theorem 1. Similarly, the computed

Current-MVS{x|fa3(x)∈ MVS(e@t = ck)POj@t=ck} is MVS(d@t =

ck)POj@t=ck when the subroutine is called for the third time.

Therefore, when the computation reaches vertex an, the com-puted CurrentMVS{x|fan(x)∈ MVS(c@t = ti)POj@t=ck} is

the MVS of b at t = tiwith respect to POjat t = ck.

From the preceding discussion, it shows that a CurrentMVS set is the MVS of a traversed vertex with respect to POj at t = ck. According to (7), these MVSs will be intersected by other MVSs of the same vertex with respect to the other POs

(6)

at different time instances. After all the POs at all the positive clock edge have been applied, the MVS of each traversed vertex in a time frame will be computed and recorded for the PCS calculation later in the process.

B. MVS Formula for Operations

Given a previously computed MVS set (PreviousMVS), ver-tex v, and one of the fan-in vertices u of v, CurrentMVS is the set of all the values at u’s output yu that make the function of vertex v, i.e., fv, generate an output value that is in PreviousMVS. To compute CurrentMVS, we first consider a particular value p in PreviousMVS and try to find the set of all the values that make fv generate p at v’s output yv. If such a set can be derived for each particular value p in

Previous-MVS, then the union of these sets for each p in PreviousMVS

yields CurrentMVS. We denote this special set for value p as

Sub_CurrentM V Sp.

For most unary and binary operations, Sub_CurrentM V Sp can easily be derived by inversing fv. Take the operation “yv= −yu” as an example. If p =−2, inversing the minus operation “−” produces yu= 2. Take the operation “yv= yu+ b1” as

another example. If p = 8 and b1= 3, inversing “+,” i.e.,

yu= 8− 3, shows that yu is equal to 5. Integer b1 is the

simulated value of an operand other than the output of u yu and is recorded in the dump file. The formula for computing

Sub_CurrentM V Sp is summarized in the third column of Table I. The second column shows the necessary conditions for the result of Sub_CurrentM V Sp to exist. If the conditions are not met, in most of cases, Sub_CurrentM V Sp={∅}, except in comparisons. The derivation of Sub_CurrentM V Sp formulas for some representative operations is explained here.

1) Operations that choose a bit range (“[i]” and “[i : j]”): For operation “[i : j],” the only constraint on the input values is that the binary values of the bits selected by “[i : j]” must be the same as the those of value p. The binary values of the unselected bits are arbitrary and can be of any possible value. Thus, the value of the unselected bits from 0 to j− 1 can be any integer ranging from 0 to 2j− 1. The value of the unselected bits from i + 1 to

w− 1 can be any integer ranging from 0 to 2w−i−1_{− 1.} Hence, the formula for operation “[i : j]” appears in the third column of Table I. Sub_CurrentM V Sp for “[i]” can be derived by treating i the same as j in the “[i : j]” formula.

2) Control vertices (“if (. . .) . . . else . . .” and “case(. . .)”): If yu is the control signal, yu can only have values that select suitable branches to keep the output of vertex v yv at p. This computation can be done by comparing the value of each variable on each branch with p. If yuis the signal on the taken branch, yu can only be p, such that yvis p.

3) Comparison operations (“>,” “<,” “==,” etc.): Take “<” as an example. If p is equal to 1, yu can only have values smaller than b1. These values are{[0 ∼ b1− 1]}.

The derivations for other comparisons are quite similar. 4) Right shift “” and left shift “”: Either a right shift

or a left shift by b1 incurs information loss. The “[i : j]”

Fig. 6. Modeling information loss in the right and left shifts.

formula can tackle the information loss. As shown in Fig. 6, the entire right shift (left shift) is the cascade of an operation that selects the bit range from i to j “[i : j]” and a divide (multiply) operation. Therefore, to derive the formula of a right shift (left shift) operation, the divide (multiply) formula is first applied; then, the “[i : j]” formula is applied. For formula derivation of other operations, such as “+,” “−,” and “∗,” the “[i : j]” MVS formula can also be applied to model the

informa-tion loss if encountered.

If the formulas listed in the third column of Table I are directly applied to compute CurrentMVS, for a PreviousMVS with n integers, the formula should be applied n times before deriving the union of all the Sub_CurrentM V Sp to produce CurrentMVS. Take the operation “b = a[1 : 0]” as an

exam-ple. Assume that a is four bit wide, b is two bit wide, and

P reviousM V S ={0, 1, 2}. To compute CurrentMVS, first,

the “[i : j]” formula with i = 1, j = 0, w = 4, and p = 0 is applied, which yields

24−1−1₋₁

k=0

[0· 20+ k· 21+1∼ 0 · 20+ k· 21+1+ 20− 1] ={0, 4, 8, 12}. (9) The same formula can be used, with p = 1 and p = 2 in sequence, to obtain {1, 5, 9, 13} and {2, 6, 10, 14}, respectively. The union {0, 1, 2, 4, 5, 6, 8, 9, 10, 12, 13, 14} is

Current-MVS. The computation may take much time if there are many

elements in PreviousMVS. The following observations are used to refine the Sub_CurrentMVS formulas for deriving the MVS formulas used in the subroutine MVS_for_Vertex.

Taking a closer look at the results obtained with p = 0, p = 1, and p = 2, it can be observed that 0∗ 20_{+ k}_{∗ 2}1+1_{+ 2}0₋₁₌

k∗ 21+1 _{and 1}_{∗ 2}0_{+ k}_{∗ 2}1+1_{= k}_{∗ 2}1+1_{+ 1 are two}

con-tinuous integers, as well as 1∗ 20_{+ k}_{∗ 2}1+1_{+ 2}0_{− 1 = k ∗}

21+1_{+ 1 and 2}_{∗ 2}0_{+ k}_{∗ 2}1+1_{= k}_{∗ 2}1+1_{+ 2. Therefore,}

the union of the preceding three sets can more concisely be represented as 24−1−1₋₁ k=0 [0· 20+ k· 21+1∼ 2 · 20+ k· 21+1+ 20− 1] = 3 k=0 {[k · 4 ∼ 2 + k · 4]} . (10)

(7)

TABLE I

Sub_CurrentM V SpFORMULAS

More generally, for a set of continuous integers from p to q in PreviousMVS, the computed CurrentMVS is

2w−i−1−1 k=0

[p× 2j+ k× 2i+1∼q × 2j+ k× 2i+1+ 2j− 1].

(11) The “[i : j]” MVS formula is derived and listed in the third column of Table II.

The “” operation is another example of the derivation of the “” formula listed in the third column of Table II. First, the smallest integer pin the set{[p ∼ q]} is found, which satisfies

p%2b1_[b

1] = 0. If there is no such p in the set {[p ∼ q]},

CurrentMVS will be φ. If pexists in{[p ∼ q]}, p+ 2b1_needs

to be is in the range of p−q. If it is, the union of the two result sets obtained by pand p+ 2b1_{can be expressed as}

2b1₋₁

∪ k=0

[k· 2b1_{+ p}_/2b1 ∼ k · 2b1_{+ (p}_{+ 2}b1_)/2b1_]_{. (12)}

Repeating the preceding derivations produces the “” for-mula in Table II.

For a subset of integers{[p ∼ q]} in PreviousMVS, applying the MVS formulas listed in the third column of Table II can much more quickly produce results than the application of the formulas in Table I. In addition, all the integers in the subset

{[p ∼ q]} can be memorized by recording only p, q, and the

special tag “∼.” This storage format enhances memory usage and alleviates the problem of memory explosion.

C. MVS Estimation for Reconvergent Paths

The algorithm shown in Fig. 5 can compute the exact MVS of vertex b in time frame t = ti with respect to POj in time frame t = ck only if there is a single path from b at t = ti to POjat t = ck. If there are multiple reconvergent paths, another approach is needed.

A quick estimation strategy for handling reconvergent paths, instead of some other complex methods that may need more computation time, is adopted in this paper. If there are multiple

reconvergent paths from v at t = ti to POj at t = ck, the universe U is used, instead of real MVS(b@t = ti)POj@t=ck

in the intersection operation. The estimation result obtained using the universe must include the exact result obtained by intersecting the real MVS(b@t = ti)POj@t=ck, as the latter is

included in the universe U . Consequently, this estimation result has a larger MVS set. That causes the estimated LOEP to be less than the real one. Therefore, this estimation approach guar-antees lower bound estimations of LOEP for an arbitrary error candidate v on reconvergent paths. This reduces the possibility of the correctness of any design errors to be overestimated and the design errors to be placed in the first few lines of a

(8)

TABLE II

MVS FORMULAS FORHDL OPERATIONS

Fig. 7. Vertex v and one of its fan-in vertex v.

D. Time-Saving Strategies

To reduce the computation time, we develop the bounded

tra-versal strategy and the limited traversed frame (LTF) strategy.

The bounded traversal strategy can avoid unnecessary traver-sals during MVS computation without causing any accuracy loss. The LTF strategy saves additional time at the expense of accuracy loss. It provides additional flexibility for tool users to determine how they like to trade off between accuracy and time.

1) Bounded Traversal Strategy: In the proposed MVS

com-putation, after some backward traversals, there are MVS sets recorded on vertices that have been traversed. As shown in Fig. 7, let vertex v at t = tn be one of the fan-in vertices of vertex v at t = tn, and both were also traversed. As-sume that MVS(v@t = tn) and MVS(v@t = tn) are already recorded on v and v. MVS(v@t = tn) should be{x | fv(x)∈ MVS(v@t = tn)} according to the CurrentMVS computation shown in line 12 of the MVS_for_Vertex pseudocode in Fig. 4.

If another backward traversal from a PO arrives

at vertex v in time frame t = tn, PreviousMVS and MVS(v@t = tn) are intersected, as described in line 4 of the MVS_for_Vertex pseudocode. If the result of the intersection

remains to be MVS(v@t = tn), i.e., MVS(v@t = tn)⊆ P reviousM V S, then, when the computation reaches v, the result of the intersection will also be MVS(v@t = tn). More specifically, if MVS(v@t = tn)⊆ P reviousMV S, then MVS(v@t = tn)⊆ {x|fv(x)∈P reviousMV S}. Theorem 2 provides a statement and proof.

Theorem 2: If MVS(v@t = tn)⊆ P reviousMV S, then MVS(v@t = tn)⊆ {x | fv(x)∈ P reviousMV S}. The orig-inally recorded MVS(v@t = tn) remains unchanged after the intersection.

Proof: The MVS(v@t = tn) is computed based on MVS(v@t = tn). That is, MVS(v@t = tn) is the set {x| fv(x)∈MVS(v@t=tn)}. For an arbitrary element x in MVS(v@t = tn), fv(x) is in MVS(v@t = tn) and is thus also in PreviousMVS since MVS(v@t = tn)⊆P reviousMV S. There-fore, if MVS(v@t = tn)⊆P reviousMV S, then MVS(v@t = tn)⊆{x|fv(x)∈P reviousMV S}. The originally recorded MVS(v@t = tn) remains unchanged after the intersection.

If v has at least one fan-in vertex v, by mathematical

(9)

Fig. 8. Pseudocode of the PCS computation algorithm.

after the intersection. So do the vertices that are in the transitive

fan-in of vertex v. Therefore, when PreviousMVS includes

the recorded MVS of a vertex v, an immediate return from subroutine MVS_for_Vertex can avoid unnecessary traversals and computations since further computations will not change any recorded MVSs.

2) LTF Strategy: The bounded traversal strategy can avoid

unnecessary traversals. However, in some cases, necessary backward traversals can still expand many frames. Although accurate results are produced, the required computation time may become unaffordable. Therefore, we propose the LTF strat-egy, which provides an optional and flexible tradeoff between accuracy and speed.

The idea of the LTF strategy is to restrict the number of backward-traversed frames in time frame expansion. It only requires a simple check on whether the number of expanded frames reaches the maximum allowable number of frames (which is denoted as the f rame_limit). f rame_limit is a configurable parameter that can be adjusted by the users. It can be set as a small number for a quick estimation or as infinite for the highest accuracy. Unlike the bounded traversal strategy, this strategy may experience some accuracy loss. However, a lower bound estimation of observability is always guaranteed, such that our observability measures seldom overestimate the correctness of the DUV. The reason is given here.

For a vertex u in time frame t = ck, if the expanded frames are not limited, each MVS of u at t = ck will be intersected with respect to an observation point at a positive clock edge in the set of MVS sets{MVS1, MVS2, . . . , MVSm}. With the f rame_limit restriction, some MVSs of u at t = ck with respect to some OPs are not obtained since the backward

traver-sals are bounded and do not reach u in time frame t = tk. As-sume that the obtained MVSs are{MVS1, MVS2, . . . , MVSn}, where n < m. The intersection of all the MVSs in the set

{MVS1, MVS2, . . . , MVSn} includes the intersection of all the MVSs in the set {MVS1, MVS2, . . . , MVSm}. Larger MVS set intersections turn out to be less observable ac-cording to the definition of observability in (5). Therefore, our LTF strategy also guarantees lower bound estimations of observability.

E. PCS Computation Algorithm

To incorporate MVS estimation for reconvergent paths and the time-saving strategies, we modify the algorithm shown in Fig. 4 and derive the one abstracted as pseudocode in Fig. 8. Hence, the algorithm that we use to calculate PCS incorporates the following: 1) MVS computation for a single path; 2) MVS estimation for reconvergent paths; 3) the bounded-traversal strategy; and 4) the LTF strategy. The input of this algorithm are given as follows: 1) the DUV described in an HDL; 2) the value change dump file during simulation; 3) the EOE; 4) an error

space obtained by any error space identification approach; and

5) the f rame_limit value selected by the tool user.

During traversal(s) that starts from a PO (StartPO) at a time instance (StartTime), if vertex v is visited for the first time, the single-path case is temporarily assumed. The PreviousMVS will be intersected with MVS(v@t = ti), which is already the intersection of many PreviousMVSs. However, if this vertex v is found to be traversed for two or more times in the traversal starting from StartPO at StartTime, there are reconvergent paths from v at t = tito StartPO at StartTime. Then, the previously

(10)

Fig. 9. Computation processes starting from PO1 at t = 1.

recorded M V Sf orRecovery(v@t = ti) is used to cancel the intersections made in this traversal before.

Two conditions are added for incorporating the two time-saving strategies. The condition in line 5 of the

MVS_Com_for_Vertex subroutine is added to incorporate the bounding traversal strategy. The last condition in line 16 is

added because of the LTF strategy. Once one of the conditions is met, the succeeding computation processes can be skipped, and the program can directly return from the subroutine to save computation time. Aside from being bounded by time-saving strategies, traversals are also bounded if there is no frame to expand (th< 0) or there is no fan-in vertex to traverse.

The preparation phases of this algorithm are shown in lines 1 and 2 of CS_computation. The three-address code generations and the conditional statement modification developed in [22] must be conducted first for the information required in the MVS computation for control vertices (conditional statements). The detailed conditional statement modification algorithm can be found in [22]. Next, a CDFG based on the input DUV described in HDL is constructed.

The example in Fig. 1 is used to demonstrate the processes of our PCS computation and its performance in the derivation of a debugging priority. After some initializations, the CDFG of the DUV based on the HDL code in Fig. 1 is constructed, as shown in Fig. 9(a). Then, a backward traversal from PO1 at

t = 1 commences by calling subroutine MVS_Com_for_Vertex,

with the inputs P reviousM V S ={4}, vertex v = “2 : if",

StartP O = PO1, and StartT ime = 1.

When subroutine MVS_Com_for_Vertex is called for the first time, the traversal also reaches vertex “2:if” in time frame t = 1 for the first time. As shown in Fig. 9(b), the recorded MVS(2 : if@t = 1) ={4}, and no MVSforRecovery is recorded. Vertex “2:if” in time frame t = 1 is a control vertex. Therefore, there are two fan-in vertices “2 :<” and “3 :=” for further back-ward traversals. We decide to traverse “2 :<” before traversing “3 :=” and compute CurrentMVS. Because PreviousMVS is {4}, the MVS computation for the conditional statements will be used, and we obtain CurrentMVS {1}. The computation process is shown in Fig. 9(b).

Fig. 10. Computation starting from PO1 at t = 5 and t = 15.

Subroutine MVS_Com_for_Vertex is then called for the sec-ond time, and “2 :<” in time frame t = 1 is reached. The computation status is shown in Fig. 9(c). Similar computations are recursively repeated vertex by vertex until vertex “6 :=” in time frame t = 1 is reached. The computation results along the traversal from “2:if” to “6 :=” are shown in Fig. 9(d), where each set of integers next to an edge is the recorded MVS. Since vertex “6 :=” in time frame t = 1 has no

fan-in vertex, the computation will traverse another fan-fan-in vertex

“3 :=” of vertex “2:if.” The repetitious calling of subrou-tine MVS_Com_for_Vertex can produce the results shown in Fig. 9(e). After completing the traversals and MVS computa-tions starting from PO1 in time frame t = 1, continuing the backward-traversal-based MVS computation from PO1 in time frame t = 5 can produce the results shown in Fig. 10(a) and (b). When the computation reaches vertex “5:if” in time frame

(11)

Fig. 11. Debugging priority and the PCS.

t = 1, PreviousMVS{[0 ∼ 15]} will include MVS(5 : if@t =

1) ={[0 ∼ 3]}. The bounded traversal condition is satisfied, and the traversal is bounded here.

After all the POs at all the positive clock edges before the EOE are applied in the MVS computation, the PCS is calculated with formulas based on the computed MVSs. With the PCS, a

debugging priority with the PCS (in round brackets) is obtained,

as shown in Fig. 11. It can be seen that the design error S7 is displayed in the first line. A search for design errors according to this debugging priority will immediately succeed. In the experimental results in Section V, it is also proven that the proposed PCS can efficiently deliver the debugging priority with high accuracy, which greatly reduces both the time and effort required for design error searches in the input error space.

V. EXPERIMENTALRESULT

The experiments were conducted on a subset of the ITC’99 benchmark [23] and four other designs written in Verilog HDL. The four designs are given as follows: pcpu is a simple 32-bit pipelined DLX central processing unit; div16 is a 16-bit divider; blkJ is a controller of a black jack card game, and Mtrx implements a two-by-two matrix multiplication. The number of lines (#line) of all the designs and the number of variables (#var) are presented in Table III.

For every design case, one statement is randomly chosen and injected with an artificial design error based on typical bugs that designers usually induce [24]. With the injected error, a simulation is run until some incorrect values occur on the POs. Then, the error space identification approach proposed in [21]3

is applied to obtain an error space. After that, the CS calcula-tion in [21] and our PCS calculacalcula-tion are both applied to derive two respective debugging priorities for the same error space.

With a debugging priority, the error candidates that a digital circuit designer has to examine before he/she can find a design error are often less than those with blind searching. In a sense, we can think that the size of the error space is thus reduced, as a result of the debugging priority. With respect to the two debugging priorities, since the injected error may have two different ranked orders, the effectiveness of the two debugging priorities on the size reduction of the same error space is also different. To compare the effectiveness of the two debugging priorities, a quantitative metric called the effective size ratio (ESR) is formulated as “the rank of the injected error divided by the number of error candidates in the error space.” The two

3_{We apply the error space identification method proposed in [22]. However,}

the proposed PCS is theoretically applicable to any other error candidate identification methods.

debugging priorities sorted with the CS and PCS have their own different ESRs. A smaller ESR means that the error has a better rank with respect to the size of the error space. This also implies that the effective size reduction contributed by the corresponding debugging priority is larger and that the efforts required for design error searching in the error space are less.

With each design case, the aforementioned experimental processes are repeated 50 times. In each repetition, the ESRs of the CS and PCS are calculated and recorded. The average ESR values with respect to the CS and PCS are also presented in the “Avg_ESR_CS” and “Avg_ESR_PCS” columns, respectively. The number of times in which the ESR values of the CS and PCS appear in three ESR value ranges is also recorded to show the distribution of the ESR values. The three ESR ranges are “ESR < 0.2(0∼ 0.2),” “0.2 < ESR < 0.5 (0.2 ∼ 0.5), ” and “0.5 < ESR(0.5∼ 1.0).” The number of times is presented in the “#case_CS” and “#case_PCS” columns.

In Table III, it can be observed that, when the PCS is used to obtain a debugging priority, in all the design cases, the average ESR values are all less than 0.2 and the average ESR values of the CS. For example, in design “B02,” if the CS is used to de-rive the debugging priority, the ESR value is less than 0.2 in 38 of the 50 experiments. In other words, our created errors are placed within the first 20% of the displayed list of error space in 38 of the 50 experiments. However, if the PCS is applied in-stead, in each of the 50 repetitions, the injected error always ap-pears in the first 20%. If a designer conducts error searching on design “B02,” with the debugging priority sorted with the PCS, he/she can locate the error by checking less than 20% of the derived error candidates. At least 80% of the searching effort is saved. Moreover, it can be seen that the ESR values of the PCS is never greater than 0.5 in the 50 repetitions. This means that, even in the worst case of the 50 repetitions, a debugging priority sorted with the PCS can still save more than half the amount of effort needed for design error searching in the error space. In contrast, the CS method cannot offer this benefit.

From the values of Avg_ESR_PCS and Avg_ESR_CS, it can be observed that the effective size reduction with respect to the PCS is much greater than that with respect to the CS. The ratio of Avg_ESR_PCS to Avg_ESR_CS shown in the “ESR Ratio” column is about 0.49 in average and 0.38 in the best case; this means that, with the PCS, further size reduction of 50%, on average, is possible, and a size reduction of 62% is also achieved in the best case, compared to the CS.

In design cases “B01” and “B02,” it can be observed that the ESR ratio is not as small as that in other cases. In these two cases, there are few operations that can mask erroneous effects. In other words, the erroneous effects are seldom masked in the two cases, making the advantage of calculating the PCS not obvious. However, in most designs, the HDL operations that can mask erroneous effects, such as “==,” “>,” “[:],”and “,” are widely applied. Using the PCS in those cases should be beneficial. As demonstrated in other design cases, the proposed PCS method saves more time in the error-searching process than the CS method, showing that the PCS is more suitable to derive the debugging priority in most cases.

Another interesting observation is also found in our experi-ments. In the design case “pcpu,” we observe that the design

(12)

TABLE III

COMPARISON OF THECSANDPCS PERFORMANCES

Fig. 12. PCS values of the design errors and observation cycles.

errors with smaller PCS values often take more simulation cycles to observe their erroneous effect on the POs, as shown in Fig. 12, after the errors are exercised. Similar correlations also exist in design cases with many operations that can mask erroneous effects in our experiments. However, for small design cases with few operations that can mask erroneous effects, such as design cases “B02” and “B01,” such relationship may not be obvious.

The cost of this performance improvement is little computa-tion time, compared to the CS. The computacomputa-tion time needed for the two measurements PCS and CS is presented in the “t(s)” columns. It can be seen that, in the worst case, it takes additional 2 s to obtain the PCS, compared to the time required to obtain the CS (4.1 s). The extra computation time is acceptable if we notice that it should usually take more than 2 s for a designer to examine one error candidate to justify its correctness, but the number of examinations saved as a result of applying the PCS is numerous.

VI. CONCLUSION

In this paper, a probabilistic measurement PCS for deriving an accurate and reliable debugging priority for quick error searching among error candidates in an error space has been presented. Instead of assuming that the erroneous effects caused by some activated errors are seldom masked, the proposed PCS takes the masking error situation into consideration and

estimates the LOEP of an error candidate. The idea is, if the LOEP is high, a masking error situation is unlikely to occur, and the error candidate is a false candidate with high possibility, i.e., the candidate tends to be a correct statement. On the other hand, if the LOEP is low, occurrence of a masking error situation becomes quite possible. Suspicion of the error candidate still re-mains, and this candidate should thus receive a low PCS score. The experimental results confirm that the proposed PCS measurement is indeed accurate in estimating the likelihood of correctness of error candidates. In most experimental cases, the created design errors can be located in the first few lines of the candidate list of the input error space. As a result, the debugging priority sorted with the proposed PCS can effectively speed up the error-searching process in the input

error space. As compared to the CS, the proposed PCS-based debugging priority can save more than half of the effort (or

time) needed for the error-searching process in an error space in our experiments, at the cost of a little extra computation time. The savings in time contributed by the proposed PCS method is usually much larger than the extra computation time that the PCS calculation needs. Therefore, the gain from the proposed PCS can often outweigh the cost of the extra computation time that the PCS needs.

REFERENCES

[1] C. C. Lin, K. C. Chen, S. C. Chang, and M. M. Sadowska, “Logic syn-thesis for engineering change,” in Proc. IEEE/ACM Des. Autom. Conf., Jun. 1995, pp. 647–652.

[2] H. T. Liaw, J. H. Taih, and C. S. Lin, “Efficient automatic diagnosis of digital circuits,” in Proc. Int. Conf. Comput. Aided Des., Nov. 1990, pp. 464–467.

[3] D. Brand, “Incremental synthesis,” in Proc. Int. Conf. Comput. Aided

Des., Nov. 1994, pp. 14–18.

[4] M. Tomita and H. H. Jiang, “An algorithm for locating logic design errors,” in Proc. IEEE/ACM Des. Autom. Conf., Jun. 1990, pp. 468–471. [5] P. Y. Chung, Y. M. Wang, and I. N. Hajj, “Diagnosis and correction of

logic design errors in digital circuits,” in Proc. IEEE/ACM Des. Autom.

Conf., Jun. 1993, pp. 503–508.

[6] A. Smith, A. Veneris, M. F. Ali, and A. Viglas, “Fault diagnosis and logic debugging using Boolean satisfiability,” IEEE Trans. Comput.-Aided

(13)

[7] S. Y. Huang, K. T. Cheng, K. C. Chen, and D. I. Cheng, “ErrorTracer: A fault simulation-based approach to design error diagnosis,” in Proc. IEEE

Int. Test Conf., 1997, pp. 974–981.

[8] S. Y. Huang and K. T. Cheng, “ErrorTracer: Design error diagnosis based on fault simulation techniques,” IEEE Trans. Comput.-Aided Design

Integr. Circuits Syst., vol. 18, no. 9, pp. 1341–1352, Sep. 1999.

[9] D. W. Hoffmann and T. Kropf, “Efficient design error correction of digital circuits,” in Proc. Int. Conf. Comput. Des., Sep. 2000, pp. 465–472. [10] A. D’Souza and M. Hsiao, “Error diagnosis of sequential circuits using

region-based model,” in Proc. 14th Int. Conf. VLSI Des., Jan. 2001, pp. 103–108.

[11] N. Sridhar and M. S. Hsiao, “On efficient error diagnosis of digital cir-cuits,” in Proc. Int. Test Conf., 2001, pp. 678–687.

[12] A. Veneris, S. Venkataraman, I. N. Hajj, and W. K. Fuchs, “Multi-ple design error diagnosis and correction,” in Proc. VLSI Test Symp., Apr. 1999, pp. 58–63.

[13] V. Boppana, I. Ghosh, R. Mukherjee, J. Jain, and M. Fujita, “Hierarchical error diagnosis targeting RTL circuit,” in Proc. Int. Conf. VLSI Des., Jan. 2000, pp. 436–441.

[14] V. Boppana and M. Fujita, “Modeling the unknown! Towards model-independent fault and error diagnosis,” in Proc. Int. Test Conf., 1998, pp. 1094–1101.

[15] M. Khalil, Y. Le Traon, and C. Robach, “Towards an automatic diagnosis for high-level validation,” in Proc. Int. Test Conf., 1998, pp. 1010–1018. [16] M. Khalil, O. Benkahla, and C. Robach, “An experimental

compari-son of software diagnosis methods,” in Proc. 25th EUROMICRO Conf., Sep. 1999, pp. 146–149.

[17] C. H. Shi and J. Y. Jou, “An efficient approach for error diagnosis in HDL design,” in Proc. Int. Symp. Circuits Syst., May 2003, pp. IV-732–IV-735. [18] B. R. Huang, T. J. Tsai, and C. N. Liu, “On debugging assistance in assertion-based verification,” in Proc. 12th Workshop Synthesis Syst.

Integr. Mixed Inform. Technol., Oct. 2004, pp. 290–295.

[19] Y. C. Hsu, B. Tabbara, Y. A. Chen, and F. Tsai, “Advanced technique for RTL debugging,” in Proc. IEEE/ACM Des. Autom. Conf., Jun. 2003, pp. 362–367.

[20] B. Peischl and F. Wotawa, “Automated source-level error localization in hardware designs,” IEEE Des. Test Comput., vol. 23, no. 1, pp. 8–19, Jan./Feb. 2006.

[21] T. Y. Jiang, C. N. Liu, and J. Y. Jou, “Effective error diagnosis for RTL designs in HDLs,” in Proc. 11th Asia Test Symp., 2002, pp. 362–367. [22] F. Fallah, S. Devadas, and K. Keutzer, “OCCOM—Efficient computation

of observability-based code coverage metrics for functional verification,”

IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 20, no. 8,

pp. 1003–1015, Aug. 2001.

[23] F. Corno, M. S. Reorda, and G. Squillero, “RT-level ITC’99 bench-marks and first ATPG results,” IEEE Des. Test Comput., vol. 17, no. 3, pp. 44–53, Jul.–Sep. 2000. [Online]. Available: http://www.cad.polito.it/ tools/itc99.html

[24] A. Offutt and G. Rothermel, “An experimental evaluation of selective mutation,” in Proc. Int. Conf. Softw. Eng., May 1993, pp. 100–107.

Tai-Ying Jiang received the B.S. degree in electrical

engineering from National Tsing Hua University, Hsinchu, Taiwan, in 1999 and the M.S. degree in electronics engineering from National Chiao Tung University, Hsinchu, in 2001. He is currently work-ing toward the Ph.D. degree in electronics engineer-ing in the Department of Electronics Engineerengineer-ing, National Chiao Tung University.

His research interests include functional validation and semiformal verification for HDL designs and error diagnosis for HDL designs.

Chien-Nan Jimmy Liu received the B.S. and Ph.D.

degrees in electronics engineering from National Chiao Tung University, Hsinchu, Taiwan.

He is currently an Assistant Professor with the De-partment of Electrical Engineering, National Central University, Jhongli City, Taiwan. His research inter-ests include functional verification for HDL designs, high-level power modeling, and analog behavioral models for system verification.

Prof. Liu is a member of Phi Tau Phi.

Jing-Yang Jou (S’82–M’83–SM’02–F’05) received

the B.S. degree in electrical engineering from Na-tional Taiwan University, Taipei, Taiwan, in 1979 and the M.S. and Ph.D. degrees in computer science from the University of Illinois, Urbana-Champaign, in 1983 and 1985, respectively.

He was with GTE Laboratories from 1985 to 1986 and AT&T Bell Laboratories, Murray Hill, KY, from 1986 to 1994. From 2000 to 2003, he was a Full Professor and the Chairman of the Department of Electronics Engineering, National Chiao Tung University, Hsinchu, Taiwan. From February 2004 to March 2007, he was the Director General of the National Chip Implementation Center, National Applied Research Laboratories, Hsinchu. Since April 2007, he has also been the Executive Director of the National SoC Program. He is with the Department of Electronics Engineering, National Chiao Tung University, Hsinchu 300, Taiwan and is Vice Chancellor (Academic Affairs) of the University System of Taiwan (consisting of four research universities: National Central University, Jhongli City, Taiwan; National Chiao Tung University; National Tsing Hua University, Hsinchu; and National Yang Ming University, Taipei). He has authored more than 160 technical papers. His research interests include logic and physical syn-thesis, design verification, computer-aided design for low power, and network on chips.

Dr. Jou currently serves as Associate Editor for the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS. He was the Technical Program Chair of the 2007 International Symposium on VLSI Design, Automation, and Test (VLSI-DAT), the 12th VLSI Design/CAD Symposium (2001), and the Asia-Pacific Conference on Hardware Description Languages (APCHDL’97). He was also the Honorary Chair of the 2006 International Workshop on Multi-Project Chip and the Executive Chair of the 2nd Taiwan–Japan Microelectronics International Symposium (2002). He was also elected President of the Taiwan Integrated Circuit Design Society during 2007–2008. He was the recipient of the Distinguished Paper Award of the IEEE International Conference on Computer-Aided Design in 1990; the Outstanding Academy–Industry Cooperation Achievement Award from the Ministry of Ed-ucation, Taiwan, in 2002; and the Outstanding Electrical Engineering Professor Award from the Chinese Institute of Electrical Engineering in 2006.