今年度本計畫共產出下列一篇期刊論文與一篇會議論文:
1. Tsung-His Chiang and Lan-Rong Dung, 2006, June, “System Level Verification on High-Level Synthesis of Dataflow Algorithms Using Petri Net,” WSEAS transactions on Circuits and Systems, Issue 6, Vol. 5, pp.790-796
2. Tsung-Hsi Chiang, and Lan-Rong Dung, “System-Level Verification on High-Level Synthesis of Dataflow Graph,” ISCAS 2006.
以下是研究成果內容:
System Level Verification on High-Level Synthesis of Dataflow Algorithms Using Petri Net
Tsung-Hsi Chiang & Lan-Rong Dung Department of Electrical and Control Engineering
National Chiao Tung University 300, Hsinchu City
Taiwan, R.O.C.
Abstract: - This paper presents a formal verification algorithm using the Petri Net theory to detect design errors for high-level synthesis of dataflow algorithms. Typically, given a dataflow algorithm and a set of architectural constraints, the high-level synthesis per-forms algorithmic transformation and produces the optimal scheduling. How to verify the correctness of high-level synthesis becomes a key issue before mapping the synthesis results onto a silicon. Many tools exist for RTL design, but few for high-level synthesis. Instead of applying boolean algebra, this paper adopts the Petri Net (PN) theory to verify the correctness of the synthesis result, because the Petri Net model has the nature of dataflow algorithms. Herein, we propose two approaches to realize the PN-based formal verification algorithm and conclude the best one who outperforms the others in terms of processing speed and resource usage.
Key-Words: - Formal verification; high-level synthesis; Petri Net; dataflow graph
1 Introduction
With increasing design complexity of digital signal processing system, verification becomes a more and more important within the design flow. In modern circuits, it is observed that up to 80% of the overall design costs are due to verification. Formal verification techniques which ensure 100% coverage of function and system model correctness have gained large attention. In [1,2], authors give excellent survey of major trends of formal verification techniques which can be classified into two categories, equivalence checking [2] and model checking [3]. Equivalence checking is used to proof the functional equivalence of two design representations modeled at the same or different levels of abstraction. Model checking is a process that checks the correctness of a design model with given properties. Although formal verification for logic synthesis has been studied very extensively, little work has been done for high-level synthesis.
This paper presents a novel verification algorithm to verify high-level synthesis (HLS) of dataflow algorithms. Given a dataflow graph (DFG) and architectural constraints, the HLS aims to generate the task schedule with processor assignment.
Typically, the HLS performs algorithmic transformation, such as retiming, scaling, and unfolding, on the DFG to meet the architectural constraints, and allocates resources accordingly [4,5,6,7,8]. Both algorithmic transformation and resource allocation require complex procedures. These procedures are rather heuristic and error-prone. The integer linear programming (ILP), for instance, is one of the popular techniques applied for HLS. The success of ILP is relied on the completeness of clauses. Any mistake or incomplete description made in the ILP may result in an illegal solution and screw up following synthesis results. This paper intends to present a formal verification algorithm to unveil the faults produced in HLS.
In the proposed algorithm, we employed the Petri-Net model as the formal description to check the correctness of dataflow behavior. Petri Net model has the nature of dataflow computing, and hence a good tool for the verification of algorithmic transformations and datapath scheduling. The use of the Petri-Net is two-folded. First, the Petri-Net model of dataflow algorithm can hold the data dependence and hence any legal transformation has to conform to the firing rules of the Petri-Net model. Secondly, the scheduling candidate is correct if and only if the initiation of each task is allowed in the Petri-Net model. Comparing with the traditional model checking techniques, the first use can provide simple but thorough model for restructured algorithms while the second use can avoid false negative problems.
The inputs to the proposed formal verification are the system description and task schedule. The system description is basically a fully-specified flow graph (FSFG) [9].
The FSFG represents the behavioral specification of the dataflow algorithm which is also a design entry of HLS. In HLS, to meet the architectural constraints, the algorithmic transformation normally reconstructs the initial FSFG to find out optimal scheduling results. The reconstructed FSFG is admissible if and only if it is equivalent to the initial FSFG. To verify the correctness of the task schedule, the proposed algorithm first converts the initial FSFG to a Petri-Net model which is expressed by Petri-Net characteristic matrix, because any admissible reconstructed FSFG has to have the same characteristic matrix.
Another input is the schedule, the DUV (design under verification), generated by HLS. The schedule is expressed in the format of processor-time chart (or chart).
The chart equally shows the firing sequence. The proposed verification uses the firing sequence to unveil the legal algorithmic transformations applied for the original FSFG. The legal algorithmic transformations will then be candidates to trace the firing sequence of the given schedule.
Based on the inputs, the proposed verification first extracts the initial firing pattern and uses it to determine the candidate reconstructed FSFGs. The candidates will then be verified with the Petri-Net model. If there exist at least one candidate who can allow the firing sequence to execute legally (without against the firing rules), the HLS result is claimed as a correct solution; otherwise, the verification will show the counter example in proof of the incorrectness. In this paper, we propose two approaches to realize the PN-based formal verification and conclude the best one who outperforms the others in terms of processing speed and resource usage.
The remainder of this paper is organized as follows. Section 2 describes some useful definition and proposed modeling technique. The proposed high-level verification technique and verification algorithms are presented in section 3. In section 4, we discuss the complexity analysis of two verification algorithms. In section 5 some experimental results are given. Section 6 gives the conclusions of this paper.
2 Definition and Modeling
In this section, we will discuss some useful definitions and proposed transformation technique to transform a FSFG into PN model.
2.1 Fully-Specified Signal Flow Graph
Fully-Specified Signal Flow Graph (FSFG) [9] or DFG is a natural paradigm for describing DSP algorithms. A FSFG GFSFG(V,E,D), where V={v1,…,vn} and E={e1,…,em}, is a three-tuple directed and edge-weighted graph which contains a vertex set V, a directed edge set E, and an ideal delay set D. Vertex set V represents
atomic operation of functional units. A vertex may have a zero execution delay, such as the signal duplicator, or may be assumed to take non-zero unit time, such as adder or multiplier. Directed edge set E describes the direction of flow of data between functional units. Inter data dependencies between functional units are denoted by weighted edges. Figure 1, for instance, shows a third-order IIR filter in the form of FSFG.
Fig. 1. A third-order IIR filter in the form of FSFG.
2.2 Petri Net Model
A Petri Net GPN(P,T,W,M) is a four-tuple [10], where P={p1,…,pn} and T={t1,…,tm} are finite sets of place and transition, W is the weighted flow relation, and M0 is the initial marking. A marking is a function M:P→ZZ. If M(pi)=k for place pi, we will say that pi
is marked with k tokens. If W(u,v)>0, then there is an arc from u to v with weight W(u,v). Usually, matrix representation gives a complete characterization of Petri Net.
The characteristic matrix of PN is defined by incidence matrix A, which is a
|P|×|T|-matrix with entries
ij j i i j
A =W tr p , −W p tr , (2)
Marking m0 is an |P|×1 column vector with entities m0(i)=M(pi), ∀pi ∈P. We say that is a valid marking if and only if m0(pi)≥0. Let xj={trj}=(…,0,1,0,…) be the unit |T|×1 column vector, which is zero everywhere except in the j-th element. Transition tj can be represented by the column vector xj. We say that tj is enabled at a marking m0 if m0≥A⋅xi for every element of m0 is a non-negative integer. And the result m′ of firing enabled transition tj in a marking m0 is represented by
0 j
m′ =m + ⋅A x (3)
2.3 Transformation from FSFG to PN
The FSFG is attractive to algorithm developers because it directly models the equations of DSP algorithm. Yet, it does not sufficiently unveil the dynamical behaviour and the implementation limits in terms of the degree of parallelism and the memory requirement. Thus, we use Petri Net to model DSP algorithms. It also allows us to discover the characteristic of the target architecture and to observe the dynamical behaviour of the algorithm.
The FSFG GFSFG(V,E,D) of a DSP algorithm can be modelled as PN GPN(P,T,W,M0) by applying following rules. First, functional element set V and edge E of FSFG can be transformed into the transition set T and the place P with respect. Since, each place in PN has only one output, the pseudo transition of each fork edge will be added as source duplicators. At last, the delay element set D of edge in FSFG is corresponded to the number of tokens of place in PN. By applying above transition rules, an example in Figure 2 shows the PN model of the third-order IIR filter in Figure 1. The characteristic matrix A with the initial marking m shows the matrix representation of the PN model, for instance.
Fig. 2. A PN graph and the matrix representation to the third-order IIR filter of Figure 1.
2.4 Schedules to the FSFG
In HLS, a FSFG design may contain cycles to model a DSP application with loops.
The intra-iteration precedence relation is represented by the edge without delay and the inter-iteration precedence relation is represented by the edge with delays. Given an edge e(vi,vj)∈E in FSFG design, d(e) means the data used as inputs in node vj are generated by node vi at d(e) inter-iteration before. A static schedule of a cycle FSFG is a repeated pattern of an execution of the corresponding loop. And a static schedule must obey the precedence relations of the directed acyclic graph (DAG) portion of a FSFG design that is obtained by removing all edges with delays from that FSFG.
Let dji
be the execution delay for each task node opji
, the length le(S) of a schedule S is the latest finish time of all the operations scheduled, that is le(S)=max{ϕ(opji
)+dji-1|∀ opji∈V}. For each task node opji∈V, a schedule of the FSFG design is given as following:
■ Start time: tij =ϕopij, : → Ζ = , ,Kϕ V + {1 2 }
■ The earliest task-finished step:
{ }
min i i 1 | i
etf j j j
t = ϕop +d − ∀op ∈V
3 High-Level Verification
In this section, proposed two-stage verification technique is introduced. The algorithms to both stages are also presented separately as the implementations.
3.1 Verification Flow
A flowchart illustrating our verification flow is shown in Figure 3. There are two inputs to the flow: a given schedule and the original FSFG. The given schedule is the DUV (design under verification) that needs to be verified. The original FSFG reserves the characteristics of the system that the DUV must be satisfied. The proposed verification method tries to find the correct restructured FSFG, which is candidate to the DUV at the first stage, and then, it checks whether the execution sequence, the DUV, of the PN model corresponded to the candidate is satisfied at the second stage.
Before introducing two-stage verification method, we address the preprocessing on both inputs separately.
One of the inputs is the given schedule. In system-level design flow, designers may use unfolding algorithm to pursue perfect FSFG achieving iteration period bound on their original FSFG design. Usually, the FSFG of the DSP algorithm describes one iteration of the computation. By applying unfolding algorithm on the FSFG is to unfold the original FSFG by a factor f which implies f consecutive iterations of
the design. In contrast, we perform unfolding checking in our verification flow to detect the unfolding factor f from given schedule. Another input to the verification flow is the original FSFG graph. It is transformed into a PN model by proposed transformation rules.
In PN domain, the markings, which can be reached from the initial marking, can be seen as the retimed FSFGs of the original design. Some reachable markings are the correct restructured FSFGs for the given schedule. These markings, which dominate the correctness of the given schedule, are said to be the candidate markings. In order to find the candidate markings, Breadth-First algorithm is used to traverse all the markings of PN reachability tree at the first stage. If the candidate marking does not exist, it means the correct retimed FSFG does not exist, it reports the given schedule is not valid due to absent of the candidate marking. If the candidate marking exists, we continuously apply Depth-First algorithm on each candidate marking.
Fig. 3. Flowchart for the proposed high-level verification method.
The given schedule is valid if there exists an initial marking, the candidate marking, of the PN model leading a firing sequence of the schedule valid. At the second state, we apply Depth-First traverse procedure on each candidate marking to check whether the given schedule is valid by checking the firing sequence of the
schedule. At last, if it exists such candidate marking, the flow is done and reports given schedule is valid, or a counterexample of invalid firing sequence is reported if given schedule is invalid.
3.2 The Candidate Marking
The candidate marking set is a subset of the reachable marking set of a Petri Net. A candidate marking is probably the correct initial marking, it also means correct retimed FSFG, which leads the firing sequence of a given schedule being valid. Let S be a schedule of a FSFG. The earliest task-finished set etf_set of S are the tasks which are finished at the earliest task-finished step tetf in S, such that
{
ij| ij etf ij}
etf_set= op ε = , ∀t op ∈ .V (4)
Marking m is defined as a candidate marking, if marking m and firing sequence σ satisfies Definition 1.
Definition 1 Marking m is said to be a candidate marking if and only if there exists a firing sequence σ:tr1…trk, such that for all tasks op∈etf_set are covered by all the transitions in σ, i.e. etf_set⊆σ. And it is also satisfied that each firing transition trj∈σ is either a pseudo transition, dj=0, or an earliest task-finished transition, εj=tetf.
Fig. 4. Check whether a marking is a candidate marking.
Fig. 5. An example schedule and 2nd order IIR filter.
As an example, Figure 4 shows the procedure to check whether a marking is candidate. For a given schedule in Figure 5, the earliest task-finished step is tetf=1, and the earliest task-finished set is etf_set={opji|εji
=tetf=1}={v5,v9}. Marking m = [ 0 0 0 0 0 1 1 2 0 0 0 0 0 0 ] is said to be a candidate marking of the corresponded PN model.
Since, there exists a firing sequence σ:tr10 tr10 tr5 tr9, m→σ m4, such that etf_set⊆σ. The markings, m1…mk, of the firing sequence, 10 10 5 9
1 2 3 4
tr tr tr tr
m→m→m →m →m, are valid states.
3.3 Marking sets
The proposed high-level verification method includes two stages: the Breadth-First and the Depth-First traverse procedures. At the first stage, the Breadth-First traverse procedure tries to find candidate markings, the correct retimed FSFGs, from reachability tree. At the second stage, the Depth-First traverse procedure verifies given schedule by checking the candidate markings. Since, the nodes of reachability tree are exponential growth with the height of the tree, two-stage method is the better
policy. The verification method shortens the searching space by finding candidate markings at the first stage. At the second stage, it verifies given schedule by checking candidate markings rather than all the reachable markings of reachability tree.
Fig. 6. The relation between reachable, candidate and solution marking sets.
Assuming there are n operations in a given FSFG, and hence there are n transitions in the corresponded PN model. Let f be the unfolding factor of a given schedule while designers performing unfolding technique on their FSFG design. At the first stage, the procedure tries to find the candidate marking set from the reachable marking set from the reachability tree and fires each transition once each time. The height of each marking in reachability tree is the distance from the root node to itself. Since, during one iteration period of the schedule S, le(S), each scheduled task must be fired once, the height can also be seen as the number of transitions that have been fired since the root node. Thus, for an n-tasks schedule, the upper height-bound of the reachability tree is bounded by Hup=f⋅n. At the second stage, it continually finds the solution marking set from the candidate marking set. The set relation between three marking sets is shown in Figure 6, that is S3⊆S2⊆S1. The purpose of the first stage is trying to reduce the searching space from reachable marking set S1 to candidate marking set S2, while the second stage is trying to find solution marking set S3 from candidate marking set S2.
3.4 First stage: Breadth-First Traverse
At the first stage of the verification method, we apply Breadth-First traverse procedure to find the candidate markings from the reachability tree. Two approaches, which include the early-terminated and the optimal approaches, are proposed in this paper and discussed in the following sections.
3.4.1 The early-terminated approach
The second approach to verify a schedule of a given FSFG is called the
early-terminated approach which improves the exhaustive approach. Before introducing the improved approach, we first consider Lemma 1.
Lemma 1 Let Ttree be a reachability tree which is bounded by upper height-bound Hup and m1 be any one of the candidate markings in Ttree. For any other candidate marking m2 in the successor path of marking m1, m2 is in the solution marking set S3 if and only if m1 is in S3.
Fig.8. The traverse order of early-terminated approach.
The early-terminated approach uses Lemma 1. It tries to minimize the size of candidate set S2 from reachable set S1. The difference between the exhaustive and the early-terminated approaches is that when an enqueued unvisited marking is candidate, the early-terminated approach ignores the candidate marking and marks as a visited node. Then, it proceeds other unvisited nodes in queue Q until all the markings have been visited. In Figure 8, as an example, the traverse order of the early-terminated approach is m, m1, m2, m3, …, m10.
3.4.2 The optimal approach
The second approach to verify a schedule of a given FSFG is the optimal approach which is improved from the early-terminated approach. In order to reduce reachable marking set S1 of the reachability tree, it tries to merge the redundant nodes when it proceeds Breadth-First traverse.
Let m be an unvisited node to be processed. If m is a candidate marking, it ignores this node by using Lemma 1 and proceeds other unvisited nodes in queue. If m is not a candidate marking, it applies negative test to find enabled set of transitions and
creates new node on each enabled transition. For each new produced node with marking m′, if there exists another node in the reachability tree, and has the same marking associated with it, then the node with marking m′ is a duplicate node. Since, the marking m′ has appeared in the tree, this new produced node is redundant. Then, it
creates new node on each enabled transition. For each new produced node with marking m′, if there exists another node in the reachability tree, and has the same marking associated with it, then the node with marking m′ is a duplicate node. Since, the marking m′ has appeared in the tree, this new produced node is redundant. Then, it