Compliance verification with the spec FSM

Chapter 2 Preliminaries

2.5 Specification FSM

2.5.5 Compliance verification with the spec FSM

By introducing the spec FSM, a DUV is said to be compliant with the specification if and only if there exists no valid input sequence (of any arbitrary length) along with the corresponding DUV output sequence that can drive the spec FSM into its state vio.

Chapter 3 Our Approach

3.1 Problem formulation

Now the problem of interface compliance verification can be interpreted as the compliance verification between the DUV FSM and the spec FSM. It can be further formulated as:

Given the spec FSM Ms = (Qs, Σs, φ, σs, qs0) where Σs =Ictrl∪ Octrl

and the DUV FSM Md = (Qd, Σd, ∆d, σd, qd0) where Σd = Ibus ∪ Icore

and ∆d=Obus ∪ Ocore,

verify if the DUV FSM complies with the spec FSM.

For the spec FSM, Σs =Ictrl∪ Octrl infers that it considers only the signals that affect the bus behavior. Other signals like the address or data bus can be neglected to reduce the complexity. For the DUV FSM, Σd = Ibus ∪ Icore and ∆d=Obus ∪ Ocore infers that it can contain self-defined core signals along with bus signals. Therefore, the DUV FSMs of the same protocol may contain different functional logic because of different core functions.

But our algorithm can verify only the protocol part and cleverly neglect the functional logic that differs from design to design.

3.2 Observations

Figure 3-1 shows the FSM of an example AHB slave interface design. Its outputs in Figure 3-1 from left to right are HREADY, HRESP[1], HRESP[0]. When it detects a write request from a master, it moves to the state write and responds OKAY to indicate that the write operation is done. When it detects a read request from a master, it first moves to the state prep to insert a wait cycle, and then moves to the state read and responds OKAY to indicate that the read operation is done at the next cycle. Otherwise, it stays in the state sleep when the slave is not requested or is requested for an IDLE or BUSY transfer.

Figure 3-1. The FSM of an AHB slave interface design.

eS : HSEL + HREADYin +

How do we verify if the FSM in Figure 3-1 complies with the protocol in Figure 2-4?

Note that this is not simply the FSM equivalence checking problem since these two FSMs are intrinsically different (with different I/O sets). Besides, there is neither a subset nor a superset relation between these two FSMs. However, the states in these two FSMs do have some sort of corresponding relations. For example, when the DUV FSM is in the state sleep, the spec FSM may be in the state orig, because both of them mean the slave is not requested. These two states are named as a corresponding state pair.

However, the corresponding relation among states is not always 1-to-1. It can also be n-to-1 or 1-to-n. This is the reason why the DUV FSM is not simply a subset or superset of the spec FSM. For example, the state orig and the state idle/busy in the spec FSM are both able to correspond to the state sleep in the DUV FSM because the DUV responds identically when it is not requested or is requested for an IDLE or BUSY transfer. In addition, the state seq/nseq in the spec FSM is able to correspond to the state read and write in the DUV FSM because they all represent the data transfer requests.

Since this compliance verification is neither equivalence checking nor subset/superset checking, the problem must be solved by other means. Actually, the discussions in this section suggest that the DUV is proved to be compliant if and only if all possible corresponding state pairs are examined and none of them includes the state vio. Hence the issue becomes how to exhaustively explore all possible corresponding state pairs for the given FSMs.

3.3 Solution

An intuitive idea about finding all possible corresponding state pairs is to grow a tree whose nodes are the corresponding state pairs (or state pairs in short). The root node

consists of the two initial states. If there exist certain values of inputs and outputs that make the DUV FSM and the spec FSM move from the initial state to the state A and B, respectively, then node (A, B) is produced as a child of the root node.

Figure 3-2 illustrates the tree-growing process. node1 is the state pair of the initial states in Figure 2-4 and Figure 3-1. Consider the outgoing edge e2 of orig in Figure 2-4 and the outgoing edge eW/100 of sleep in Figure 3-1, the intersection of the Boolean functions of these two edges (forig,seq/nseq • fsleep,write) is not equal to 0.

This infers that there exists certain input along with the corresponding output that can simultaneously drive both transitions. For example, the input

“HSEL=HWRITE=HREADYin=1, HTRANS=NSEQ”, which drives ew, along with the output 100, can drive e2 as well. Hence, node5 (seq/nseq, write) is created as a child of node1.

Similarly, all outgoing edges of orig versus all outgoing edges of sleep have to be considered in the above manner. Then we can get all children of node1 as shown in Figure 3-2.

In this way, the process exhaustively grows all children and all grandchildren of the root node and so on. Finally all possible state pairs are present as nodes in this tree.

However, this tree can have an infinite depth and thus the tree-growing process seems infeasible. Therefore, certain bounding condition is required to prune the tree to be finite without losing any possible corresponding state pair.

In fact, we can stop growing children of a node if this node has been present in the tree.

For example, in Figure 3-2, we do not have to grow children of node2 since they should be the same as those of node1. This bounding condition does not fail to find any possible state pair. Because the sub-trees growing from node1 is identical to those growing from node2, all possible state pairs growing from node2 can also be found in those growing from node1. Hence, growing the tree rooted from node2 is completely unnecessary.

Figure 3-2. The tree-growing process.

3.4 Algorithm

In summary, our branch-and-bound algorithm starts to grow a tree from the initial state pair. It keeps growing child nodes for each node unless the node has been present in the tree or a violation to the protocol is found. The formal description of the algorithm is shown in the next page. In the algorithm, S denotes the stack containing all nodes that have been present in the tree. qs and qd denote the states of the spec FSM and the DUV FSM, respectively. qs0 and qd0 are the initial states. The function grow_tree performs the branch-and-bound operations on the node. The equation fqs,qi• fqd,qj ≠0 means the intersection of the Boolean functions of the selected edges is nonzero. That is, there exists certain I/O value that can make the spec FSM and the DUV FSM move from the current states, qs and qd, to the next states, qi and qj, respectively. If this condition holds and the next state pair has not been present in the tree (i.e., (qi,qj)∉S), the next state pair (qi,qj) is

seq/nseq, prep seq/nseq, write idle/busy, sleep

orig, sleep

orig, sleep 1

2 3 4 5

grow_tree(qi,qj). Note that if the next state of the spec FSM is vio (i.e., qi=vio), the tree-growing process stops and the error trace is given as a counterexample. Furthermore, if the next state of the spec FSM is dc (i.e., qi=dc), the function grow_tree(qi,qj) is not performed since the input value that drives the transitions from qs and qd to qi and qj is illegal, which means this condition is not supposed to take place.

S=φ ;

grow_tree(qs0 ,qd0) ;

grow_tree (qs ,qd) { S = S ∪ (qs ,qd) ;

for i=1 to |eqs| // for all outgoing edges of the state qs for j=1 to |eqd| // for all outgoing edges of the state qd

if ( fqs,qi• fqd,qj≠0 && (qi,qj)∉S ) if (qi≠dc && qi≠vio )

grow_tree(qi ,qj) ; else if (qi == vio)

give a counterexample and exit;

}

3.5 Complexity analysis

3.5.1 Time complexity

The time complexity is estimated by the iteration count in the tree:

iteration count =

∑

^N ^(|e^qs,n^|×|e^qd,n^{|) ,} ( 3 - 1 )

time complexity = O(

∑

= N n 1

(|eqs,n|×|eqd,n|) ) . (3-2)

In the above equations, |eqs,n| and |eqd,n| denote |eqs| and |eqd| at the n-th recursion, and N denotes the number of recursion times. For the worst case,

N = |Qs|×|Qd|, ( 3 - 3 )

|eqs,n| = |Qs|, ( 3 - 4 )

|eqd,n| = |Qd|, ( 3 - 5 )

=> worst-case iteration count = ( |Qs| × |Qd| )² , (3-6)

=> worst-case time complexity = O( ( |Qs| × |Qd| )² ) . (3-7)

Equation (3-3) holds only when all combinations of states in two FSMs are valid state pairs. Equation (3-4) and (3-5) hold only when the graph representations of two FSMs are complete graphs. However, these worst-case conditions rarely occur. Actually, experimental results show that the iteration count is typically far lower than this upper bound.

3.5.2 Space complexity

In the above algorithm, the space complexity is dominated by the size of two stacks.

One is the visited-node stack, i.e., S in the algorithm. The other is the error-trace stack, which is not explicitly specified in the algorithm. The visited-node stack contains the corresponding state pairs that are already visited. The error-trace stack contains the I/O values and state conditions from the initial node to the node that the violation occurs. Note

violation occurs.

The size of the visited-node stack is affected by the number of corresponding state pairs, where as the size of the error-trace stack is affected by the depth of the violation.

But anyhow, the worst-case space complexity of both stacks can be expressed as:

worst-case space complexity = O( |Qs| × |Qd| ). (3-8) Because |Qs|×|Qd| is equal to the maximum number of state pairs, and also equal to the maximum depth of the tree.

Chapter 4 Extension

4.1 Spec EFSM

Timing constraints are not uncommonly encountered in interface protocols. However, they cannot be easily specified in spec FSM sometimes. For example, a simple interface protocol, which defines “ack must be asserted within 16 cycles after req is asserted”, is specified as the spec FSM in Figure 4-1(a). It requires so many states to represent such a simple protocol. Instead, if we specify this protocol in EFSM by introducing a variable count as in Figure 4-1(b), the representation becomes much clearer and easier.

Similar to the spec FSM, the spec EFSM is an EFSM M=(Q, Σ, ∆, x, T, q0, x0) whose behavior is a monitor of certain interface logic where Q contains all normal states along with two extra special states vio and dc, Σ = Ictrl ∪ Octrl, and ∆ = φ. The predicates and actions provide a convenient way to specify timing constraints that are frequently encountered in interface protocols.

(a) (b)

Figure 4-1. Specify the same protocol in (a) FSM and (b) EFSM

4.2 Extended algorithm

Meanwhile, the original algorithm should also be extended to handle the spec EFSM.

Similarly, the extended algorithm grows a tree from each node unless the node has been present in the tree. The only difference now is that the node contains not only a state pair but also the values of the corresponding variables. A simple example is given in Figure 4-2. Figure 4-2(a) gives an erroneous implementation of the interface protocol in Figure 4-1. This design violates the protocol because it may not assert ack within 16 cycles if it sticks in the state wait. One possible scenario for the tree-growing process to find the violation is shown in Figure 4-2(b). The words in each node from left to right denote the spec EFSM state, the DUV state, and the value of the variable count. The nodes in the right branch illustrate the difference between the original algorithm and the extended one.

As shown, the extended algorithm does not stop at the node (ans, wait, 14) although this node has the same state pair as its parent node (ans, wait, 15). The reason is that the extra variable count needs to be considered as well.

req

(a) (b)

Figure 4-2. (a) A wrong implementation of the protocol in Figure 4-1 and (b) the tree generated by the extended algorithm.

The extended algorithm is given below (the modified parts are shaded). Similar to the algorithm in Section 3.4, S contains the nodes that have been present in the tree, and the function extended_grow_tree performs the branch-and-bound operations. But some extension is made here. The node now contains not only qs and qd but also x, the variable value because the variable value has to be considered as well. Moreover, in the function extended_grow_tree, the condition that a child node is created includes not only fqs,qi•fqd,qj

≠0 and (qi, qj, x’)∉S, but also Pqs,qi(x)==true, which means the predicate of current transition edge must be evaluated true. It is because the transitions can never take place if the predicate is not evaluated true. verify the compliance with the following algorithm:

req / ack

S=φ ;

extended_grow_tree(qs0 ,qd0, x0) ;

extended_grow_tree (qs, qd, x) { S = S ∪ (qs, qd, x) ;

for i=1 to |eqs| // for all outgoing edges of the state qs for j=1 to |eqd| { // for all outgoing edges of the state qd x’= Aqs,qi(x); // perform the action to the variable if (fqs,qi•fqd,qj ≠0 && Pqs,qi(x)==true && (qi, qj, x’)∉S) if (qi≠dc && qi≠vio )

extended_grow_tree(qi, qj, x’) ; else if (qi == vio)

give a counterexample and exit ; }

}

4.3 Complexity analysis

4.3.1 Time complexity

The time complexity analysis of the extended algorithm is similar to the discussions in Section 3.5.1. The iteration count is equal to (3-1). That is,

∑

= N n 1

For the worst case,

N = |Qs|×|Qd|×|range(A(x))|, ( 4 - 1 )

|eqs,n| = |Qs|, ( 4 - 2 )

|eqd,n| = |Qd|, ( 4 - 3 )

=> worst-case iteration count = |range(A(x))| × (|Qs|×|Qd|)² , (4-4)

=> worst-case time complexity = O( |range(A(x))| × (|Qs|×|Qd|)²) . (4-5)

The term range(A(x)) denotes the value range of the function A(x), and thus

|range(A(x))| denotes the number of possible values of x. Equation (4-1) holds when all combinations of two states and variable values are possible, and equation (4-2) and (4-3) holds when the graphic representations of two FSMs are complete graphs. Worst case like this rarely happens.

How exactly does the complexity change after this extension? Let us compare (4-5) to (3-7), although (4-5) is multiplied by a factor of |range(A(x))|, |Qs| of (4-5) is greatly reduced in general. Hence, the complexity does not increase significantly as expected.

4.3.2 Space complexity

Since the original state pair (qs, qd) is modified as (qs, qd, x), the original maximum number of state pairs |Qs|×|Qd| should be modified as |range(A(x))|×|Qs|×|Qd| as well.

This is the same case for the maximum depth of the tree. These modifications directly affect the maximum size of the visited-node stack and the error-trace stack, which further changes the worst-case space complexity. Therefore, the worst-case space complexity becomes:

worst-case space complexity = O( |range(A(x))| × |Qs| × |Qd| ). (4-6)

4.4 Other extensions

Except for the EFSM, the FSM has other variations, for example, the interacting FSM.

Theoretically, these variations can be transformed into the equivalent FSMs. Therefore, they can be candidates of other extensions to our approach. For example, some concurrent behaviors of interface protocols can be modeled by the interacting FSM easily. However, the transformation process imposes a heavy burden on the algorithm but improves only little expressing ability. Therefore, only the EFSM extension is implemented since the trade is worthwhile.

Chapter 5 Experimental Results

5.1 Program implementation

The proposed approach has been implemented in C. Our implementation can either formally prove that the given design is fully compliant with certain interface protocol or report a precise input sequence as a counterexample to show how the given design fails in the compliance verification. Additionally, the program can automatically check the correctness of the spec (E)FSM with the method in Section 2.5.2.

5.1.1 Input file format

Both the spec (E)FSM and the DUV FSM are accepted in the enhanced BLIF(Berkeley Logic Interchange Format) [26]. This format extends the BLIF with some syntax capable to describe the reasons of each transition as well as predicates and actions required by the EFSM model. Figure 5-1 shows an example in this format. The words after the character

# are comments. These comments clearly explain the syntax.

Figure 5-1. The spec EFSM of Figure 4-1 in the enhanced BLIF.

The reason column in the enhanced BLIF provides a way to specify the meaning of each transition. With these reasons, we can provide more meaningful counterexamples to improve the compliance verification.

5.1.2 Input file check

Our implementation can check the input files with the schemes described in Section 2.5.2. In the experiments, it really catches several bugs in the original spec FSMs of the

.model the_req_ack_protocol #model name

.inputs req ack #input names

.variables count 0 #variable and its initial value

AHB slave and the WISHBONE slave protocols. One of the bugs is shown in Figure 5-2.

This bug is arisen from omitting the edge e4 in Figure 2-4. That is, this transition is accidentally ignored in the original spec FSM. If we do not check with this method and neglect this transition in the spec FSM, the verification result may not be correct.

Figure 5-2. The report of catching a bug in the spec FSM of AHB slave protocol.

5.1.3 Counterexample

Our implementation gives very useful error-trace messages. It tells not only the I/O values and the state transitions, but also the reason of each transition. The error-trace message when verifying the design in Figure 4-2(a) is given in Figure 5-3. As we can see,

********************************************

Result of pre-processing

********************************************

Pre-process file golden_AHB_slave.txt ...

ERROR!In state orig :

Input 1100000 not specified!

Please modify input file golden_AHB_slave.txt!

Verification aborted! No result reported.

Figure 5-3. The verification result of the design in Figure 4-2(a).

the reason column tells the physical meaning of the current status. These reasons, along with the detail I/O sequence and state transitions, greatly facilitates the debugging of the interface design.

5.2 Result analysis

In order to investigate the effectiveness and efficiency of the proposed algorithm, compliance verification is conducted over a set of real AMBA AHB-compliant and

********************************************

Result of this compliance verification

********************************************

DESIGN VIOLATES SPEC. : Ack_Exceed_16cycles

WISHBONE-compliant [27] designs. In addition, to check whether the proposed algorithm can find the design flaws as expected, some errors are intentionally injected into the design con7 and mac as two additional benchmark designs con7_err and mac_err, respectively. The experimental results are shown in Table 1.

There are some facts worthy to be mentioned in Table 1. As it is shown, all the designs under verification support different functional modes of the protocols, but the same spec (E)FSM can be used to verify all designs of the same protocol without altering.

Furthermore, the two injected errors are successfully found. The error in the design con7_err is caused by a self-loop of the state performing the WAIT response. Thus the design may respond WAIT more than 16 cycles, which is not recommended in the AHB protocol. This is an error that designers are very likely to commit if they don’t deal with the WAIT response carefully. The other error in the design mac_err is a little more complex. When an IDLE transfer is initiated after an ERROR response, the design does not respond OKAY but respond ERROR instead, which causes a violation to the AHB protocol. This error is not uncommon because the two-cycle ERROR response is intrinsically more complicated and error-prone. With our verification approach, these errors and the reasons leading to them are clearly identified.

Table 1. The DUVs and the verification results.

I/F protocol

type DUV result supporting function reason of violation

spi [28] compliance NORMAL and

ERROR response -

WISHBONE slave

(spec FSM) ac3 ctrl[28] compliance NORMAL response -

con7 compliance

con7_err violation OKAY and WAIT

response Wait>16cycles AMBA AHB

slave (spec EFSM)

mac_err violation OKAY and ERROR

response Not_Respond_Okay The complexity comparison is shown in Table 2. It clearly demonstrates that our algorithm can correctly complete the formal compliance verification for all given designs.

在文檔中應用於系統單晶片中介面相符驗證之正規方法 (頁 23-0)