Matching-based algorithm for FPGA channel segmentation design


Short Papers


Yao-Wen Chang, Jai-Ming Lin, and M. D. F. Wong

Abstract: Process technology advances have made multimillion gate field programmable gate arrays (FPGAs) a reality. A key issue that needs to be solved in order for the large-scale FPGAs to realize their full potential lies in the design of their segmentation architectures. Channel segmentation designs have been studied to some degree in much of the literature; the previous methods are based on experimental studies, stochastic models, or analytical analysis. In this paper, we address a new direction for studying segmentation architectures. Our method is based on graph-theoretic formulation. We first formulate a problem of finding the optimal segmentation architecture for two input routing instances and present a polynomial-time optimal algorithm to solve the problem. Based on the solution to the problem, we develop an effective and efficient multi-level matching-based algorithm for general channel segmentation designs. Experimental results show that our method significantly outperforms the previous work. For example, our method achieves average improvements of 18.2% and 8.9% in routability in comparison with other work.

Index Terms: Detailed routing, interconnect, layout, physical design, routing.

I. INTRODUCTION

Due to their low prototyping cost, user programmability, and short turnaround time, field programmable gate arrays (FPGAs) have become a very popular design style for application-specific integrated circuit (ASIC) applications. With the advances in process technology, multimillion gate FPGAs have become available. A key issue that needs to be solved for the large-scale FPGAs to realize their full potential lies in the design of their routing architectures [11].

Fig. 1 shows the row-based FPGA architecture [1], [4]. The architecture is analogous to the traditional standard-cell model. The logic modules, used to implement logic functions, are placed in rows at predefined locations, and channels lie between neighboring rows of logic modules. Each logic module is linked with vertical segments for input and output. A vertical segment can be connected to a horizontal segment by programming a cross switch (marked by its own symbol in Fig. 1) to be ON. The routing tracks are divided into several segments of different lengths. Two neighboring segments can be connected together to establish a longer connection by programming the incident horizontal switch (also marked in Fig. 1) to be ON.

Unlike the traditional ASIC, the routing resources in an FPGA are prefabricated in the chip, and routing in an FPGA is performed by programming switches to make connections. The switches usually have high resistance and capacitance and, thus, incur significant delays. As shown in [2] and [6], the number of segments/switches used by a net, rather than its wirelength, is the most critical factor in controlling the routing delay in an FPGA. To achieve better performance, each track should contain fewer horizontal switches (i.e., each segment is longer and each track contains fewer segments). However, this reduces routability and wastes more wire. On the other hand, if a track contains more horizontal switches (i.e., each segment is shorter and each track contains more segments), nets can be routed with more flexibility and less wasted wire, but at the cost of performance. This tradeoff between performance and routability presents a segmentation design problem: how should a segmentation distribution be determined to maximize the routability under performance constraints?

Fig. 1. Row-based FPGA architecture.

Fig. 2. Two segmented routing examples. (a) Infeasible one-segment routing example. (b) Feasible routing example.

Manuscript received May 28, 1999; revised July 21, 2000. The work of Y.-W. Chang and J.-M. Lin was supported in part by the National Science Council of Taiwan R.O.C. under Grant NSC-87-2215-E-009-041. The work of D. F. Wong was supported in part by the Texas Advanced Research Program under Grant 003658288. This paper was presented in part at the International Conference on Computer-Aided Design, San Jose, CA, November 1998. This paper was recommended by Associate Editor M. Pedram.

Y.-W. Chang and J.-M. Lin are with the Department of Computer and Information Science, National Chiao Tung University, Hsinchu 300, Taiwan, R.O.C. (e-mail: ywchang@cis.nctu.edu.tw; gis87808@cis.nctu.edu.tw).

M. D. F. Wong is with the Department of Computer Sciences, University of Texas, Austin, TX 78712 USA (e-mail: wong@cs.utexas.edu).

Publisher Item Identifier S 0278-0070(01)03540-0.

Example 1: Fig. 2 shows a set of three nets $n_1$, $n_2$, and $n_3$ to be routed in two different segmented routing channels, each with two tracks. Each horizontal switch partitions a track into two segments. For example, in Fig. 2(a), track 1 consists of two segments (1, 2) and (3, 8) separated by the horizontal switch located between columns 2 and 3. If each net can use at most one segment for routing, then nets $n_1$, $n_2$, and $n_3$ cannot be routed simultaneously using the segmented channel shown in Fig. 2(a); however, they can be routed if each net is allowed to use up to two segments. On the other hand, the three nets are always routable on the segmented channel shown in Fig. 2(b). This example shows that segmentation designs can strongly influence the routability of an FPGA.

0278–0070/01$10.00 © 2001 IEEE

Rose and Hill [11] emphasized that segmentation distribution would become a key challenge in large-scale FPGA design. They pointed out that physical design for a large-scale FPGA would be difficult because the routing delays and resource utilization could not be handled well and, thus, it would be hard to realize the full potential of a large-scale FPGA. A well-designed segmentation can reduce not only routing delays, but also wasted wire length. Therefore, the segmentation design problem will become even more important as the multimillion-gate era arrives.

Channel segmentation designs have been studied to some degree in much of the literature [5], [8], [10], [12], [13]. El Gamal et al. [5] showed that with appropriate arrangement of segment lengths, a segmented routing channel can achieve comparable routability to a freely customized routing channel (e.g., the routing channel in a standard cell). For the channel segmentation design problem, Roy and Mehendale [12] first presented a stochastic method which approximates a given segment length distribution. Zhu and Wong [13] presented an algorithm for the channel segmentation design problem based also on a stochastic analysis. Given a distribution of nets and routing requirements, they computed the number of segmented tracks of various types required for maximum routability. Pedram et al. [10] presented an analytical model for the design and analysis of effective segmented channel architectures. In their approach, the probability density functions for the origination points and the lengths of connections were defined. Based on these functions, they estimated the track number of each type and analyzed the routability of designed segmented channels. Recently, Mak and Wong [8] enumerated the routing patterns for each net and compared the number of tracks required in a channel to accommodate the largest number of patterns to provide high routability and good delay performance for channel segmentation.

Unlike the previous methods that are based on experimental studies, stochastic models, or analytical analysis, we present in this paper a new direction for studying segmentation architectures for row-based FPGAs. Our method is based on graph-theoretic formulation. We first formulate a net matching problem of finding the optimal segmentation architecture for two input routing instances and present a polynomial-time algorithm to solve the problem. Using the solution to the problem as a subroutine, we develop an effective and efficient multi-level matching-based algorithm for general channel segmentation designs. Experimental results show that our method significantly outperforms the previous work. For example, our method achieves average improvements of 8.9% and 18.2% in routability, compared with [8] and [13], respectively. (Note that the most recent work [8] reports the best results among the previous work.)

The remainder of this paper is organized as follows. Section II formulates the segmentation design problem. Section III presents our algorithms for channel segmentation design. Experimental results are reported in Section IV. Finally, we conclude our paper and discuss future research directions in Section V.

II. PROBLEM FORMULATION

The channel segmentation design problem arising from the row-based FPGA architecture is to determine a channel segmentation architecture to achieve "best" routability under some given constraints (e.g., area and timing constraints). By "best" routability, we mean that the segmentation architecture can accommodate as many routing instances as possible. Here, a routing instance consists of a set of nets for routing, which may correspond to routes in a real circuit design or nets generated from some particular net distribution function.

In this paper, we use the following notations.

$L$ Length of a channel, measured in the number of columns. We number the columns from 1 to $L + 1$.

$T$ Number of tracks in the channel (area constraint).

$K$ Maximum number of segments allowed for routing a single net (timing constraint).

$m$ Number of channel routing instances.

$n$ Number of nets in each routing instance.

For the channel segmentation, each net is an interval, which can be characterized by its leftmost and rightmost points. The leftmost and rightmost points of net $i$ are represented by $\mathrm{left}_i$ and $\mathrm{right}_i$, respectively. The span of net (interval) $i$ is from $\mathrm{left}_i$ to $\mathrm{right}_i$, denoted by $[\mathrm{left}_i, \mathrm{right}_i]$. One net overlaps another if the spans of the two nets intersect. A segment covers a net (interval) if the span of the net is within the bounds of the segment. A set $S$ of segments covers a routing instance $I$ (i.e., a set of nets) if for each net $i$ in $I$ there exists a segment $s$ in $S$ that covers $i$, and no two nets are covered by the same segment. We use $K$ to model the timing bound for all nets. For $K$-segment routing, each net can use up to $K$ segments. For $K = 1$, a net can be routed on a segment as long as the segment covers the net. When a segment is assigned to a net, the segment is occupied and cannot be used for any other net. It is clear that if two nets overlap, they cannot be routed on the same track. For $K \ge 2$, each net can use multiple segments by programming the corresponding horizontal switches to connect the segments. However, as in one-segment routing, each segment cannot be occupied by more than one net at a time.
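The covering condition above can be checked mechanically. The following Python sketch is illustrative only: it assumes nets and segments are stored as `(left, right)` column pairs, and the helper names `covers` and `covers_instance` are hypothetical, not taken from the paper. Because each segment may serve at most one net, the check amounts to assigning nets to distinct segments, done here with a simple augmenting-path search.

```python
def covers(segment, net):
    """A segment covers a net if the net's span lies within the segment's bounds."""
    return segment[0] <= net[0] and net[1] <= segment[1]


def covers_instance(segments, nets):
    """Return True if every net can be given its own covering segment.

    This is the one-segment (K = 1) covering condition: each net needs a
    covering segment, and no segment may be shared by two nets.
    """
    owner = {}  # segment index -> index of the net currently assigned to it

    def try_assign(n, seen):
        # Try to place net n, re-routing earlier nets along an augmenting path.
        for s, seg in enumerate(segments):
            if s in seen or not covers(seg, nets[n]):
                continue
            seen.add(s)
            if s not in owner or try_assign(owner[s], seen):
                owner[s] = n
                return True
        return False

    return all(try_assign(n, set()) for n in range(len(nets)))
```

For instance, with the hypothetical segments `[(1, 2), (3, 8), (1, 5), (6, 8)]`, the nets `[(3, 5), (4, 6), (6, 8)]` are coverable, whereas `[(3, 7), (4, 6)]` are not, since both nets fit only in segment `(3, 8)`.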

The channel segmentation design problem is formulated as follows.

1) Channel Segmentation Design Problem: Given $L$, $T$, $K$, $m$, and $n$, design a channel segmentation to maximize the success rate for $K$-segment routing.

For a fixed $K$, we refer to the problem as the $K$-segmentation design problem. When $K \ge 2$, it is also called the multisegmentation design problem.

Note that our formulation is, in fact, more general than most of the previous work. Most previous work considers only some well-defined net distribution functions (e.g., geometric and Poisson distributions, etc.). Ours, however, can not only handle well-defined distributions, but can also deal with arbitrary routing instances.

III. CHANNEL SEGMENTATION DESIGN

We first describe the motivation and framework for our channel segmentation design. Our objective is to construct a segmentation architecture that can accommodate different routing instances drawn from various distributions. To capture the patterns of these input routing instances, we propose a matching-based algorithm that merges routing instances to generate a super routing instance. Since the intervals in the super routing instance can accommodate those in each routing instance, a segmentation architecture based on the intervals in the super routing instance would lead to a good routing success rate for all input routing instances.

More specifically, given $m$ routing instances, each with $n_i$ nets (intervals), $i = 1, \ldots, m$, designing a segmentation to maximize the success rate for one-segment routing is closely related to constructing a set $S$ of segments that can cover each of the $m$ routing instances (one at a time). It is not difficult to see that using such a set $S$ of segments for one-segment routing would result in 100% routing completion. However, there is usually a limit on the number of tracks $T$ in a routing channel. Therefore, it is not always possible to construct a channel formed by all the segments in $S$. Nevertheless, the set $S$ of segments still gives a key insight into the optimal segmentation architecture for the given routing instances.

Since $S$ gives the optimal segmentation architecture, our goal is to construct a segmentation structure as close to $S$ as possible. Our method is based on graph-theoretic formulation. We first formulate a net matching problem to obtain a most economical set of segments that can cover each of two input routing instances. Based on weighted bipartite matching theory, we present a polynomial-time optimal algorithm to solve the net matching problem. Using the solution to the problem as a subroutine, we then develop an effective bottom-up matching-based algorithm for the segmentation design for an arbitrary number of input routing instances. We shall first discuss the net matching problem.

A. Net Matching Problem

The net matching problem seeks a set of intervals of least total length that covers each of two sets of intervals; in other words, if we can solve the net matching problem optimally, we can use the solution to generate a set of segments of least total length that allows complete routing of each of two routing instances.

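Let $I$ be a finite set of intervals (nets). Let $i_1 = [\mathrm{left}_1, \mathrm{right}_1]$ and $i_2 = [\mathrm{left}_2, \mathrm{right}_2]$ be two overlapping intervals. Their overlapping length, denoted by $\mathrm{olen}(i_1, i_2)$, can be computed as follows:

$$\mathrm{olen}(i_1, i_2) = \min\{\mathrm{right}_1, \mathrm{right}_2\} - \max\{\mathrm{left}_1, \mathrm{left}_2\}. \tag{1}$$

We define $\mathrm{Merge}(i_1, i_2)$ as the interval $i = [\mathrm{left}, \mathrm{right}]$, where $\mathrm{left} = \min\{\mathrm{left}_1, \mathrm{left}_2\}$ and $\mathrm{right} = \max\{\mathrm{right}_1, \mathrm{right}_2\}$. It is clear that the length of interval $i$, denoted by $\mathrm{len}(i)$, is given by $\mathrm{right} - \mathrm{left}$, and the total length of all intervals in $I$, $\mathrm{Length}(I)$, is given by $\sum_{i \in I} \mathrm{len}(i)$.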
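The definitions of olen, Merge, len, and Length translate directly into code. The sketch below is a minimal illustration that assumes each interval is a `(left, right)` tuple; the lowercase function names are chosen here for illustration and are not from the paper.

```python
def olen(a, b):
    """Overlap length of two overlapping intervals, following (1)."""
    return min(a[1], b[1]) - max(a[0], b[0])


def merge(a, b):
    """Smallest interval that contains both a and b."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def length(interval):
    """len(i) = right - left."""
    return interval[1] - interval[0]


def total_length(intervals):
    """Length(I): sum of len(i) over all intervals in I."""
    return sum(length(iv) for iv in intervals)
```

For overlapping intervals, `length(merge(a, b))` equals `length(a) + length(b) - olen(a, b)`; for example, `a = (2, 6)` and `b = (1, 6)` give a merged interval `(1, 6)` of length 5 = 4 + 5 - 4.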

Let $I$ and $J$ be two finite sets of intervals. A net matching $M$ between $I$ and $J$ is a set of ordered pairs of intersecting intervals $(i_1, j_1), (i_2, j_2), \ldots, (i_k, j_k)$, where $A = \{i_1, i_2, \ldots, i_k\}$ and $B = \{j_1, j_2, \ldots, j_k\}$ are two sets of distinct intervals from $I$ and $J$, respectively. We can replace $i_1$ and $j_1$ by $\mathrm{Merge}(i_1, j_1)$, replace $i_2$ and $j_2$ by $\mathrm{Merge}(i_2, j_2)$, $\ldots$, and replace $i_k$ and $j_k$ by $\mathrm{Merge}(i_k, j_k)$. After the replacement, the set of intervals $I \cup J$ is represented as follows:

$$\mathrm{Union}(I, J) = (I - A) \cup (J - B) \cup \{\mathrm{Merge}(i_1, j_1), \mathrm{Merge}(i_2, j_2), \ldots, \mathrm{Merge}(i_k, j_k)\}. \tag{2}$$

The net matching problem is described as follows.

1) Net Matching Problem: Given two finite sets $I$ and $J$ of intervals (nets), find a matching $M$ such that $\mathrm{Length}(\mathrm{Union}(I, J))$ is minimized.

Based on weighted bipartite matching theory, we present a polynomial-time optimal algorithm for the net matching problem. We reduce the problem to computing a maximum matching in a weighted bipartite graph. Given two finite sets $I$ and $J$ of intervals, we construct a weighted bipartite graph $G = (U, V, E)$ as follows. (See Fig. 3 for an illustration.) For each interval $i$ in $I$ ($j$ in $J$), we introduce a vertex $u_i$ ($v_j$) in the set $U$ ($V$) of vertices. For each pair of overlapping intervals $p, q$, $p \in I$ and $q \in J$, connect $u_p$ to $v_q$ by an edge $e_{pq} = (u_p, v_q)$ with a weight computed by the weight function $w: E \to Z^+$ defined as follows:

$$w(e_{pq}) = \min\{\mathrm{right}_p, \mathrm{right}_q\} - \max\{\mathrm{left}_p, \mathrm{left}_q\}. \tag{3}$$

Then, we can apply a maximum weighted bipartite matching algorithm [9] on $G$ to solve the net matching problem optimally.

Note that $w(e_{pq})$ gives the overlap length between intervals $p$ and $q$. Intuitively, this weight function measures the "similarity" between two intervals: the greater the weight, the more similar the two corresponding intervals. By merging intervals with greatest similarity, we can obtain a most economical (i.e., minimum total length) set of segments that covers each of two input interval sets.

Fig. 3. Matching and merging example. (a) Two sets of nets. (b) Corresponding weighted bipartite graph. (c) Matching result for the two sets of nets.
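The reduction can be tried out on small inputs with the sketch below. It is not the $O(n^3)$ matching algorithm of [9]: to stay self-contained it finds a maximum-weight matching by memoized exhaustive search, which is exponential in $|J|$ and suitable only for toy instances. Intervals are assumed to be `(left, right)` tuples, and `net_matching` and `union` are illustrative names.

```python
from functools import lru_cache


def olen(a, b):
    """Edge weight from (3): overlap length of intervals a and b."""
    return min(a[1], b[1]) - max(a[0], b[0])


def net_matching(I, J):
    """Return (weight, pairs) of a maximum-weight matching between I and J.

    pairs holds index pairs (p, q), meaning I[p] is matched with J[q].
    Only pairs with positive overlap are joined by an edge.
    """
    @lru_cache(maxsize=None)
    def best(p, used):
        if p == len(I):
            return 0, ()
        result = best(p + 1, used)  # option: leave I[p] unmatched
        for q in range(len(J)):
            w = olen(I[p], J[q])
            if w > 0 and not used & (1 << q):
                sub_w, sub_pairs = best(p + 1, used | (1 << q))
                if w + sub_w > result[0]:
                    result = (w + sub_w, ((p, q),) + sub_pairs)
        return result

    return best(0, 0)


def union(I, J, pairs):
    """Union(I, J) from (2): merged pairs plus all unmatched intervals."""
    A = {p for p, _ in pairs}
    B = {q for _, q in pairs}
    merged = [(min(I[p][0], J[q][0]), max(I[p][1], J[q][1])) for p, q in pairs]
    rest = [iv for k, iv in enumerate(I) if k not in A]
    rest += [iv for k, iv in enumerate(J) if k not in B]
    return merged + rest
```

As a usage example with made-up coordinates, `I = [(1, 3), (2, 6), (4, 8), (9, 11)]` and `J = [(1, 4), (5, 9), (1, 6)]` yield a matching of weight 9, and the resulting union has total length 12 + 12 - 9 = 15.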

A matching $\hat{M}$ of a graph $H = (V, E)$ is a subset of the edges with the property that no two edges of $\hat{M}$ share the same vertex. Edges in $\hat{M}$ are called matched edges; all other edges are unmatched. Let $\mathrm{Matched}(I, J)$ be the set of the matched edges in a weighted bipartite matching on the graph induced by the finite sets $I$ and $J$ of intervals, and let $\mathrm{Weight}(F)$, $F \subseteq E$, be the total weight of the edges in $F$. We have the following lemma and theorem.

Lemma 1: $\mathrm{Length}(\mathrm{Union}(I, J)) = \mathrm{Length}(I) + \mathrm{Length}(J) - \mathrm{Weight}(\mathrm{Matched}(I, J))$.

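Proof: By (2), $\mathrm{Union}(I, J) = (I - A) \cup (J - B) \cup \{\mathrm{Merge}(i_1, j_1), \mathrm{Merge}(i_2, j_2), \ldots, \mathrm{Merge}(i_k, j_k)\}$, we have

$$
\begin{aligned}
\mathrm{Length}(\mathrm{Union}(I, J))
&= \mathrm{Length}((I - A) \cup (J - B) \cup \{\mathrm{Merge}(i_1, j_1), \mathrm{Merge}(i_2, j_2), \ldots, \mathrm{Merge}(i_k, j_k)\}) \\
&= \mathrm{Length}(I) - \mathrm{Length}(A) + \mathrm{Length}(J) - \mathrm{Length}(B) + \mathrm{Length}(\{\mathrm{Merge}(i_1, j_1), \mathrm{Merge}(i_2, j_2), \ldots, \mathrm{Merge}(i_k, j_k)\}) \\
&= \mathrm{Length}(I) + \mathrm{Length}(J) - \mathrm{Length}(A) - \mathrm{Length}(B) + \sum_{p=1}^{k} \mathrm{Length}(\mathrm{Merge}(i_p, j_p)) \\
&= \mathrm{Length}(I) + \mathrm{Length}(J) - \mathrm{Length}(A) - \mathrm{Length}(B) + \sum_{p=1}^{k} \left(\mathrm{len}(i_p) + \mathrm{len}(j_p) - \mathrm{olen}(i_p, j_p)\right) \\
&= \mathrm{Length}(I) + \mathrm{Length}(J) - \mathrm{Length}(A) - \mathrm{Length}(B) + \sum_{p=1}^{k} \mathrm{len}(i_p) + \sum_{p=1}^{k} \mathrm{len}(j_p) - \sum_{p=1}^{k} \mathrm{olen}(i_p, j_p) \\
&= \mathrm{Length}(I) + \mathrm{Length}(J) - \mathrm{Length}(A) - \mathrm{Length}(B) + \mathrm{Length}(A) + \mathrm{Length}(B) - \mathrm{Weight}(\mathrm{Matched}(I, J)) \\
&= \mathrm{Length}(I) + \mathrm{Length}(J) - \mathrm{Weight}(\mathrm{Matched}(I, J)).
\end{aligned}
$$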

Theorem 1: The maximum weighted bipartite matching algorithm optimally solves the net matching problem in $O((n_1 + n_2)^3)$ time, where $n_1$ and $n_2$ are the numbers of nets in the two input sets.

Proof: Consider two finite sets $I_1$ and $I_2$ of intervals with $n_1$ and $n_2$ nets, respectively. We apply the maximum weighted bipartite matching algorithm to merge these two sets of nets. According to Lemma 1, the total interval length of the resulting merged set, $\mathrm{Length}(\mathrm{Union}(I_1, I_2))$, is given by $\mathrm{Length}(I_1) + \mathrm{Length}(I_2) - \mathrm{Weight}(\mathrm{Matched}(I_1, I_2))$. Since $\mathrm{Length}(I_1)$ and $\mathrm{Length}(I_2)$ are fixed, $\mathrm{Length}(\mathrm{Union}(I_1, I_2))$ is minimized when $\mathrm{Weight}(\mathrm{Matched}(I_1, I_2))$ is maximized. The weighted bipartite matching algorithm is guaranteed to find such a maximum value and, thus, the net matching problem is optimally solved. The time complexity of the maximum weighted bipartite matching algorithm is $O((n_1 + n_2)^3)$ [9]. The theorem thus follows.

Example 2: Fig. 3(a) shows two sets $I = \{i_1, i_2, i_3, i_4\}$ and $J = \{j_1, j_2, j_3\}$ of intervals (nets). The induced weighted bipartite graph is given in Fig. 3(b). The span of net $i$, $[\mathrm{left}_i, \mathrm{right}_i]$, is shown next to its corresponding vertex. The weight of each edge is computed by the weight function in (3) and shown beside the edge. The maximum weighted bipartite matching $M$ between $U = \{u_1, u_2, u_3, u_4\}$ and $V = \{v_1, v_2, v_3\}$ is illustrated in Fig. 3(b) by heavy lines. In this example, $M = \{(u_1, v_1), (u_2, v_3), (u_3, v_2)\}$. Note that $u_4$ is unmatched. Fig. 3(c) shows the resulting configuration of replacing $i_1$ and $j_1$ by $\mathrm{Merge}(i_1, j_1)$, $i_2$ and $j_3$ by $\mathrm{Merge}(i_2, j_3)$, and $i_3$ and $j_2$ by $\mathrm{Merge}(i_3, j_2)$. Let $l_1 = \mathrm{Merge}(i_1, j_1)$, $l_2 = \mathrm{Merge}(i_2, j_3)$, $l_3 = \mathrm{Merge}(i_3, j_2)$, and $l_4 = i_4$. After the replacement, the set of intervals $I \cup J$ becomes $\mathrm{Union}(I, J) = \{l_1, l_2, l_3, l_4\}$. The reader can verify that $\mathrm{Length}(\mathrm{Union}(I, J)) = \mathrm{len}(l_1) + \mathrm{len}(l_2) + \mathrm{len}(l_3) + \mathrm{len}(l_4) = 3 + 5 + 5 + 2 = 15$ is the minimum possible total union length for merging $I$ and $J$. (Note that $\mathrm{Length}(\mathrm{Union}(I, J)) = \mathrm{Length}(I) + \mathrm{Length}(J) - \mathrm{Weight}(\mathrm{Matched}(I, J)) = (2 + 4 + 4 + 2) + (3 + 4 + 5) - (2 + 4 + 3) = 15$.)

B. Segmentation Design Algorithm

Our design algorithm consists of three stages: 1) the matching-and-merging stage; 2) the tuning stage; and 3) the filling stage. In the matching-and-merging stage, we repeatedly apply the aforementioned weighted bipartite matching algorithm to merge input routing instances and find a set $I$ of intervals that can cover each of the input routing instances. In the tuning stage, we find a set $I'$ of intervals from $I$, $I' \subseteq I$, which can be packed (routed) into $T$ tracks. In the filling stage, we determine the switch locations on the tracks and fill the empty space between each pair of intervals in the $T$ tracks to form a set of segments.

Since the solution to the net matching problem gives a set of intervals of least total length that covers each of two routing instances, repeatedly applying it to merge all routing instances yields a set of intervals whose total length would be smaller than the combined length of the intervals in all routing instances. It is therefore more feasible to build a segmentation for complete routing from the merged intervals than from the full collection of intervals in the routing instances.

Fig. 4. Matching process.

The matching-and-merging stage proceeds in a tree-like bottom-up manner. (The whole matching-and-merging process is illustrated in Fig. 4.) Given $m$ routing instances $R_1, R_2, \ldots, R_m$ with $n_1, n_2, \ldots, n_m$ nets, respectively, we apply the aforementioned weighted bipartite matching algorithm to merge $R_1$ and $R_2$, $R_3$ and $R_4$, $R_5$ and $R_6$, and so on. (See the procedure Match and Merge( ) in Lines 5 and 8 of Fig. 6.) If $m$ is odd, then $R_m$ remains unmerged. After the merge, the number of resulting instances reduces to $\lceil m/2 \rceil$. Then, the same merging process repeats for the new $\lceil m/2 \rceil$ routing instances. The process proceeds level by level in a bottom-up manner until a final merged routing instance is obtained (see Fig. 4).
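The level-by-level pairing can be sketched independently of how two instances are merged. In the Python sketch below, `merge_levels` is a hypothetical name, and `merge_two` stands for any routine that combines two routing instances (in the paper's setting, the matching-based merge); the input is assumed to be a non-empty list of instances.

```python
def merge_levels(instances, merge_two):
    """Merge routing instances pairwise, level by level, until one remains.

    Returns the final merged instance and the number of instances present
    at each level (m, ceil(m/2), ..., 1).
    """
    level = list(instances)
    counts = [len(level)]
    while len(level) > 1:
        # Pair neighbors: (R1, R2), (R3, R4), ...
        merged = [merge_two(level[k], level[k + 1])
                  for k in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            merged.append(level[-1])  # the odd instance is carried up unmerged
        level = merged
        counts.append(len(level))
    return level[0], counts
```

With five instances and plain list concatenation as a stand-in for `merge_two`, the instance counts per level are 5, 3, 2, 1, matching the $\lceil m/2 \rceil$ reduction at each level.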

Let $I_F$ be the set of the intervals in the final merged routing instance. We have the following theorem.

Theorem 2: $I_F$ covers $R_i$, $\forall i$, $1 \le i \le m$.

Proof: Let $I_{11}$ be the resulting set of intervals for matching and merging two routing instances $R_1$ and $R_2$ in Iteration 1 (see Fig. 4). Suppose $\{i_1, \ldots, i_p\}$ ($\{j_1, \ldots, j_q\}$) are the intervals in $R_1$ ($R_2$). Without loss of generality, let $i_1$ and $j_1$, $i_2$ and $j_2$, $\ldots$, $i_k$ and $j_k$ be the matched intervals, $k \le p$ and $k \le q$. Let $A = \{i_1, \ldots, i_k\}$ and $B = \{j_1, \ldots, j_k\}$. It is obvious that $i_1, \ldots, i_k$ ($j_1, \ldots, j_k$) can be covered by $\mathrm{Merge}(i_1, j_1), \ldots, \mathrm{Merge}(i_k, j_k)$, respectively. If $k < p$ ($k < q$), each interval in $\{i_{k+1}, \ldots, i_p\}$ ($\{j_{k+1}, \ldots, j_q\}$) can be covered by itself in $R_1 - A$ ($R_2 - B$). By (2), we have $I_{11} = \mathrm{Union}(R_1, R_2) = (R_1 - A) \cup (R_2 - B) \cup \{\mathrm{Merge}(i_1, j_1), \ldots, \mathrm{Merge}(i_k, j_k)\}$. Thus, every interval $i$ in $R_1$ ($j$ in $R_2$) can be covered by some interval in $I_{11}$, and no two intervals in $R_1$ ($R_2$) are covered by the same segment. Likewise, $R_3$ and $R_4$, $R_5$ and $R_6$, $\ldots$ can be covered by $I_{12}, I_{13}, \ldots$, respectively. Similarly, let $I_{21}, I_{22}, \ldots$ be the matching-and-merging results of $I_{11}$ and $I_{12}$, $I_{13}$ and $I_{14}$, $\ldots$. Then, $I_{11}$ and $I_{12}$, $I_{13}$ and $I_{14}$, $\ldots$ can be covered by $I_{21}, I_{22}, \ldots$, respectively. This process continues as the matching-and-merging stage proceeds. It follows that $R_i$, $\forall i$, $1 \le i \le m$, can be covered by $I_F$. The theorem thus follows.

By Theorem 2, using a set $S$ of segments covering $I_F$ for one-segment routing can route all routing instances $R_1, R_2, \ldots, R_m$. As mentioned earlier, however, there is usually a limit on the number of tracks $T$ in a routing channel. Therefore, it is not always possible to construct a channel formed by all the segments in $S$.

In the tuning and the filling stages, we construct a segmentation of $T$ tracks from the final merged routing instance $I_F$. First, we apply the basic left-edge algorithm [7] to route the intervals in $I_F$. (See the procedure Route by Left Edge( ) in Line 11 of Fig. 6.) We then sort the resulting tracks in nonincreasing order of the total length occupied by their intervals. The first $T$ tracks are chosen for further construction and the tuning stage is done. (See the procedure Tune Track( ) in Line 12 of Fig. 6.) After the tuning stage, a track may contain empty space between a pair of intervals. In the filling stage, we determine the switch location to fill each empty space to construct a set of segments. In particular, we intend to optimize not only the routability for the given routing instances (done in the matching-and-merging stage), but also that for unknown instances. We apply the following theorem in the filling stage to guide the placement of horizontal switches to further optimize the routability for the unknown ones.

Fig. 5. Routable configurations for the segmented track.

Fig. 6. Algorithm for segmentation design.

Theorem 3: For one-segment routing, a two-segment routing track can cover the maximum number of net patterns when the two segments are of equal length (i.e., the switch is placed in the middle of the two segments).

Proof: Consider a track of length $n + 1$ (see Fig. 5). The column numbers range from 0 to $n + 1$. Each net under the track can be routed on the segmented track. Suppose the switch is placed between Columns $x$ and $x + 1$. For one-segment routing, the segment on the left side of the switch can cover $x$ nets of length 1, $x - 1$ nets of length 2, $\ldots$, or 1 net of length $x$. Similarly, the segment on the right side of the switch can cover $n - x$ nets of length 1, $n - x - 1$ nets of length 2, $\ldots$, or 1 net of length $n - x$. Therefore, the segmented track can cover all $x + (x - 1) + \cdots + 1$ nets on the left side of the switch and $(n - x) + (n - x - 1) + \cdots + 1$ nets on the right side of the switch.

Therefore, the number of combinations of nets that can be covered by the segmented track is given by the following function $L$:

$$
\begin{aligned}
L(x) &= (x + (x - 1) + \cdots + 1) \times ((n - x) + (n - x - 1) + \cdots + 1) \\
&= \frac{x(x + 1)}{2} \cdot \frac{(n - x)(n - x + 1)}{2} \\
&= \frac{x^4 - 2nx^3 + (n^2 - n - 1)x^2 + (n^2 + n)x}{4}.
\end{aligned}
$$

$L(x)$ attains its extreme values where $L'(x) = 0$, $x > 0$. We have

$$
\begin{aligned}
4L'(x) &= 4x^3 - 6nx^2 + 2(n^2 - n - 1)x + n(n + 1) = 0 \\
&\Longrightarrow (2x - n)(2x^2 - 2nx - (n + 1)) = 0 \\
&\Longrightarrow x = n/2,
\end{aligned}
$$

since the roots of the quadratic factor, $(n \pm \sqrt{n^2 + 2n + 2})/2$, lie outside the interval $[0, n]$.

To see that $L(x)$ has its maximum value at $x = n/2$, we compute $L''(n/2)$:

$$
\begin{aligned}
4L''(x) &= 12x^2 - 12nx + 2(n^2 - n - 1), \\
4L''\!\left(\frac{n}{2}\right) &= 12\left(\frac{n}{2}\right)^2 - 12n\left(\frac{n}{2}\right) + 2(n^2 - n - 1) \\
&= 3n^2 - 6n^2 + 2n^2 - 2n - 2 \\
&= -n^2 - 2n - 2 \\
&= -(n + 1)^2 - 1 < 0.
\end{aligned}
$$

Hence, $L(x)$ has its maximum value when $x = n/2$.

Therefore, for one-segment routing, a two-segment routing track can cover the maximum number of nets when the two segments have equal lengths (i.e., the switch is placed in the middle of the track).
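The conclusion of the proof can be checked numerically for integer switch positions. The short Python sketch below evaluates the product of the two triangular sums for every $x$ from 0 to $n$ and reports the maximizing positions; the function names are illustrative.

```python
def coverable_combinations(n, x):
    """L(x): (number of nets coverable on the left) * (number on the right)."""
    left = x * (x + 1) // 2
    right = (n - x) * (n - x + 1) // 2
    return left * right


def best_switch_positions(n):
    """All integer positions x in [0, n] that maximize L(x)."""
    values = {x: coverable_combinations(n, x) for x in range(n + 1)}
    best = max(values.values())
    return [x for x, v in values.items() if v == best]
```

For even $n$ the unique maximizer is $x = n/2$ (for $n = 10$, $x = 5$ with $L(5) = 225$); for odd $n$ the two integers nearest $n/2$ tie.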

By Theorem 3, if there are only two intervals on a track and they are separated by empty space, it is best to place a horizontal switch at the position that makes the two resulting segments most balanced in length. However, the number of intervals in a track is usually larger than two. We therefore process each pair of neighboring intervals serially from left to right according to Theorem 3. The procedure Fill Space( ) listed in Line 15 of Fig. 6 finds a better position for placing a horizontal switch in the empty space, if any, between every two neighboring intervals. The whole segmentation design algorithm is summarized in Fig. 6.
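The tuning stage described earlier (left-edge routing followed by keeping the $T$ most occupied tracks) can be sketched as follows. This is an illustrative reading, not the code of Fig. 6: intervals are assumed to be `(left, right)` tuples, two intervals are assumed to share a track only if one ends strictly before the other begins, and the function names are hypothetical. The loop is a first-fit formulation of the left-edge idea, which assigns each interval, in order of left endpoint, to the first track where it fits.

```python
def route_by_left_edge(intervals):
    """Pack intervals into tracks in order of increasing left endpoint."""
    tracks = []  # each track is a left-to-right list of intervals
    for left, right in sorted(intervals):
        for track in tracks:
            if track[-1][1] < left:  # fits after the last interval on this track
                track.append((left, right))
                break
        else:
            tracks.append([(left, right)])  # open a new track
    return tracks


def tune_tracks(tracks, T):
    """Keep the T tracks with the largest total occupied length."""
    def occupied(track):
        return sum(right - left for left, right in track)
    return sorted(tracks, key=occupied, reverse=True)[:T]
```

For example, the intervals `(1, 4)`, `(2, 10)`, `(5, 7)`, and `(8, 10)` need two tracks; with $T = 1$ the track holding `(2, 10)` is kept because its occupied length (8) exceeds that of the other track (7).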


Fig. 7. Time complexity of matching-and-merging process.

Theorem 4: Algorithm Seg Designer runs in $O(m^3 n^3)$ time, where $m$ is the number of input routing instances and $n$ is the maximum number of nets in a routing instance.

Proof: Given $m$ input routing instances, each with at most $n$ nets, by Theorem 1 we need $O(n^3)$ time to merge two routing instances in Iteration 1 using the maximum weighted bipartite matching algorithm. (See Fig. 7 for the outline of the matching-and-merging process of our method.) In the worst case, if nets in two routing instances do not match at all, the newly generated routing instance will contain $2n$ nets after Iteration 1. Then, it needs $O((2n)^3)$ time to solve the net matching problem in the next iteration. Similarly, a routing instance has at most $2^{\lceil \lg m \rceil - 1} n$ nets after Iteration $\lceil \lg m \rceil - 1$, and it needs $O((2^{\lceil \lg m \rceil - 1} n)^3)$ time to solve the net matching problem in the final iteration. Therefore, the time complexity $T(m, n)$ of the algorithm is given by

$$
\begin{aligned}
T(m, n) &\le \frac{m}{2} O(n^3) + \frac{m}{4} O((2n)^3) + \cdots + O\!\left(\left(2^{\lceil \lg m \rceil - 1} n\right)^3\right) \\
&= O\!\left(\frac{m}{2} n^3 + \frac{m}{4} (2n)^3 + \cdots + \left(2^{\lceil \lg m \rceil - 1} n\right)^3\right) \\
&= O\!\left(\frac{mn^3}{2} + 2mn^3 + \cdots + 2^{3(\lceil \lg m \rceil - 1)} n^3\right) \\
&= O\!\left(\frac{mn^3}{2} \cdot \frac{4^{\lg m} - 1}{4 - 1}\right) \\
&= O\!\left(\frac{mn^3}{6} \left(4^{\lg m} - 1\right)\right) \\
&= O\!\left(\frac{mn^3}{6} \left(m^2 - 1\right)\right) \\
&= O\!\left(\frac{m^3 n^3}{6} - \frac{mn^3}{6}\right) \\
&= O(m^3 n^3).
\end{aligned}
$$
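The geometric sum in the proof can be verified numerically. The sketch below assumes $m$ is a power of two (so that $\lceil \lg m \rceil = \lg m$) and drops the constants hidden by the $O$ notation; `merge_cost_bound` and `closed_form` are illustrative names.

```python
def merge_cost_bound(m, n):
    """Sum over levels k = 1..lg m of (m / 2^k) merges costing (2^(k-1) n)^3 each."""
    total = 0
    merges, size = m // 2, n
    while merges >= 1:
        total += merges * size ** 3
        merges //= 2  # half as many merges at the next level
        size *= 2     # worst case: instance size doubles at each level
    return total


def closed_form(m, n):
    """(m n^3 / 6) (m^2 - 1), the closed form derived in the proof."""
    return m * n ** 3 * (m * m - 1) // 6
```

For $m = 8$ and $n = 3$, both functions return 2268 (108 + 432 + 1728).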

By Theorem 4, the time complexity of our algorithm is given by $O(m^3 n^3)$, where $m$ is the number of input routing instances and $n$ is the maximum number of nets in an input routing instance. Note that empirically the number of resulting intervals per routing instance grows only linearly as the matching-and-merging process proceeds, instead of exponentially (in logarithmic steps $n, 2n, 4n, \ldots, 2^{\lceil \lg m \rceil} n$) as shown in the theoretic analysis (see Fig. 7). This will be clear when we show the empirical results in Section IV.

For $K$-segmentation design ($K \ge 2$), all we need to do is to split each segment into $K$ sections of equal length right after the aforementioned procedures. However, since the minimum length of a segment is one, it is impossible to partition an interval of length smaller than $2K - 1$ into $K$ segments. Specifically, we can partition an interval of length $l$ into at most $\lceil l/2 \rceil$ segments.

TABLE I
ONE-SEGMENT ROUTING RESULTS ($L = 100$, $T = 36$, $D = 12$, $K = 1$)

TABLE II
TWO-SEGMENT ROUTING RESULTS ($L = 100$, $T = 36$, $D = 12$, $K = 2$)

IV. EXPERIMENTAL RESULTS

We implemented our segmentation design algorithms in the C++ programming language on a personal computer with a Pentium 166 microprocessor and 32-MB random access memory. The weighted bipartite matching code was adopted from the public LEDA package. The routability of the architectures designed by our algorithms was tested using the one-segment and the multisegment routing algorithms by Zhu et al. [13]. In addition to the notation mentioned in Section II, the following notation is also needed to explain our experimental procedures.

$D$ Maximum number of net terminals at a column.

$f(l)$ Probability that a net is of length $l$.

The input routing instances were generated by the programs used in [8] and [13]. The first set of ten routing instances is based on the parameters $L = 100$, $T = 36$, and $D = 12$, which are close to the row-based architectures used by Actel FPGAs [1]. Distributions $D_i$, $i = 1, 2, \ldots, 7$, are defined as follows. If $f(l) = (p_1, p_2, p_3, p_4, p_5)$, then the probability that a net has length between $0.2(j - 1)L$ and $0.2jL$ is equal to $p_j / \sum_{1 \le k \le 5} p_k$. "Ge," "No," and "Po" are geometric, normal, and Poisson distributions, respectively. For each net distribution, 300 routing instances were generated.
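The five-bucket length distribution can be illustrated with a short sampler. This is not the generator used in [8] or [13]: the sketch only normalizes the weights $(p_1, \ldots, p_5)$ as stated above, and it additionally assumes, for illustration, that the length is drawn uniformly (as a real number) within the chosen bucket. All names are hypothetical.

```python
import random


def bucket_probabilities(weights):
    """Normalize (p_1, ..., p_5) so that bucket j has probability p_j / sum(p_k)."""
    total = sum(weights)
    return [w / total for w in weights]


def sample_net_length(weights, L, rng):
    """Draw a net length from the bucketed distribution.

    Bucket j (1-based) spans lengths from 0.2 (j - 1) L to 0.2 j L.
    Uniform sampling inside the bucket is an assumption of this sketch.
    """
    probs = bucket_probabilities(weights)
    j = rng.choices(range(1, len(probs) + 1), weights=probs, k=1)[0]
    return rng.uniform(0.2 * (j - 1) * L, 0.2 * j * L)
```

For example, `sample_net_length((1, 1, 2, 4, 2), 100, random.Random(0))` returns a length between 0 and 100, with a 40% chance of landing in the 60 to 80 range.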

The routing success was measured by the threshold density $d_T$ defined in [13]. $d_T$ is the largest channel density in a distribution such that more than 90% of the routing instances in the distribution with channel density $d_T$ can be routed successfully. Obviously, a larger $d_T$ is better.
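As a rough illustration of this metric, the sketch below computes the channel density of a routing instance and then a threshold density from a list of routing outcomes. The precise definition is the one in [13]; the code implements only one reading of the sentence above, and it assumes nets are `(left, right)` column pairs that occupy every column from `left` to `right` inclusive. The function names and the `(density, routed)` record format are hypothetical.

```python
from collections import defaultdict


def channel_density(nets):
    """Maximum number of nets crossing any single column."""
    columns = defaultdict(int)
    for left, right in nets:
        for c in range(left, right + 1):
            columns[c] += 1
    return max(columns.values(), default=0)


def threshold_density(results, rate=0.9):
    """Largest density whose instances were routed with success rate > rate.

    results is an iterable of (density, routed) pairs, one per routing
    instance; returns None if no density passes the threshold.
    """
    by_density = defaultdict(list)
    for density, routed in results:
        by_density[density].append(routed)
    passing = [d for d, outcomes in by_density.items()
               if sum(outcomes) / len(outcomes) > rate]
    return max(passing, default=None)
```

With made-up outcomes in which all instances of density 10 route, 19 of 20 instances of density 11 route, and half of the instances of density 12 route, the function returns 11.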

Tables I and II list the respective comparisons for one- and two-segment routing between our designs and those in Zhu et al. [13] based on the parameters $L = 100$, $T = 36$, and $D = 12$, which were used in [13]. The results show that our designs outperform those in [13] by averages of 24% and 12.9% improvements in routability for one- and two-segment routing, respectively.

TABLE III
TWO-SEGMENT ROUTING RESULTS ($L = 20$, $T = 18$, $D = 6$, $K = 2$)

TABLE IV
THREE-SEGMENT ROUTING RESULTS ($L = 50$, $T = 24$, $D = 8$, $K = 3$)

Fig. 8. Channel segmentation designed by our algorithm ($L = 100$, $T = 36$, $D = 12$, $K = 1$, distribution $D_1$).

The parameters used for net distributions in [8] were $L = 20$, $T = 18$, and $D = 6$ for two-segment design and $L = 50$, $T = 24$, and $D = 8$ for three-segment design. The results, reported in Tables III and IV, show that our method significantly outperforms the previous work in [8] and [13]. Our designs achieve averages of 17.9% and 8.9% improvements in routability compared with the work in [13] and the most recent work in [8], respectively. Fig. 8 shows our one-segment channel segmentation design for distribution $D_1$ using the parameters $L = 100$, $T = 36$, $D = 12$, and $K = 1$.

TABLE V

ROUTABILITY COMPARISONS OF MERGING ROUTING INSTANCES WITH DIFFERENT PAIRINGS FOR TWO-SEGMENT ROUTING

Fig. 9. Average number of nets in one instance at each iteration.

We also performed experiments to explore the effect of applying different pairings in the matching-and-merging stage. Table V lists the results for the two-segment design. In the original experiments, we used the pairings $(R_1, R_2), (R_3, R_4), \ldots$ Here, we tested the pairings $(R_2, R_3), (R_4, R_5), \ldots$ The other procedures remain the same. The ordering of the merging sequence may affect the routability of our design; as shown in Table V, there is an average variation of 5% in individual $d_T$'s between the two pairing schemes. However, the average $d_T$ values for the ten distributions remain about the same for the two schemes. The results show that no matter what pairing scheme is applied, our matching-based approach performs well and the overall routability of the designs is quite stable across different pairing schemes.
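The two pairing schemes compared in Table V differ only in where the consecutive pairing starts. A minimal Python sketch of that difference follows; the function name and the treatment of unpaired instances are assumptions, since the text does not state how instances left without a partner are handled.

```python
def pair_instances(instances, offset=0):
    """Pair consecutive instances starting at index `offset`.

    offset=0 gives (R1, R2), (R3, R4), ...; offset=1 gives (R2, R3), (R4, R5), ...
    Returns (pairs, leftovers). Assumption: instances that do not fall into a
    pair are returned as leftovers; the source does not say how they are handled.
    """
    pairs = []
    leftovers = list(instances[:offset])
    i = offset
    while i + 1 < len(instances):
        pairs.append((instances[i], instances[i + 1]))
        i += 2
    # Anything after the last complete pair is also left unpaired.
    leftovers.extend(instances[i:])
    return pairs, leftovers
```

With six instances, `offset=0` pairs all of them, whereas `offset=1` leaves the first and last instances unpaired.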

Our design algorithms are quite efficient. The empirical runtimes for the largest set of designs ($L = 100$, $T = 36$, $D = 12$, and $K = 2$) ranged from 10 s for distribution $D_2$ to 78 s for distribution $D_7$ (with an average runtime of about 30 s). Although the theoretical analysis gives $O(m^3 n^3)$ time complexity for our algorithm, the empirical runtime for the $m$ term is close to $O(m \lg m)$ instead of $O(m^3)$. The reason is that most of the nets in two input routing instances were merged together. Therefore, the number of nets in a merged instance grew only linearly instead of exponentially. In Fig. 9, the average number of nets per routing instance is plotted as a function of the number of iterations (in the matching-and-merging stage) for each of the ten distributions listed in Table I. The curves in Fig. 9 exhibit the linearity of the empirical growth rates for the average number of nets per routing instance.

V. CONCLUSION

We have presented a new direction for studying FPGA segmentation design based on a graph-matching formulation. Unlike previous work, which starts from some approximation (e.g., fitting particular distribution functions), our method targets the optimal architecture directly (e.g., the set $\mathbb{F}$ of segments resulting from the matching-and-merging stage can cover each input routing instance for one-segment designs, giving a 100% routing success rate if there is no area constraint). We believe that targeting the optimal architecture directly, rather than starting from an approximation as in previous work, is the major factor that leads to the substantially better performance.

We have shown that the matching-based approach is effective and efficient for channel segmentation design. In particular, the approach is also very flexible, which makes it a promising alternative to more complex segmentation designs. Future work lies in the extension to higher order (e.g., two- and three-dimensional) segmentation designs. Also, as shown in the experimental results, the pairing of input routing instances in the matching-and-merging stage has some impact on the quality of the channel segmentation design. To explore the best pairing scheme, we propose to apply a general weighted graph-matching algorithm to find the minimum cost pairing among given routing instances.
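To make the minimum cost pairing concrete, the Python sketch below finds it by exhaustive search over all perfect pairings. This is a stand-in for illustration only: it is not the general weighted graph-matching algorithm proposed above, and it is practical only for a handful of instances. The cost matrix is hypothetical, since the text does not define how the cost of pairing two routing instances would be measured.

```python
def min_cost_pairing(cost):
    """Exhaustively find a minimum-cost perfect pairing.

    cost: symmetric square matrix (list of lists); cost[i][j] is a hypothetical
    cost of pairing routing instances i and j. The number of instances must be
    even. Returns (total_cost, list of (i, j) pairs).
    """
    n = len(cost)
    if n % 2 != 0:
        raise ValueError("need an even number of instances")

    def solve(remaining):
        # remaining is a tuple of instance indices that are still unpaired.
        if not remaining:
            return 0, []
        first, rest = remaining[0], remaining[1:]
        best_cost, best_pairs = None, None
        # Try every possible partner for the first unpaired instance.
        for k, partner in enumerate(rest):
            sub_cost, sub_pairs = solve(rest[:k] + rest[k + 1:])
            total = cost[first][partner] + sub_cost
            if best_cost is None or total < best_cost:
                best_cost, best_pairs = total, [(first, partner)] + sub_pairs
        return best_cost, best_pairs

    return solve(tuple(range(n)))
```

For larger numbers of instances, a polynomial-time weighted matching algorithm, such as those covered in [9], would replace the exhaustive search.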

ACKNOWLEDGMENT

The authors would like to thank Dr. K. Zhu and Prof. W. K. Mak for providing packages for segmentation designs and the anonymous reviewers for their constructive comments.

REFERENCES

[1] FPGA Data Book and Design Guide, Actel Corp., Sunnyvale, CA, 1996.

[2] B. Fallah and J. Rose, “Timing-driven routing segment assignment in FPGAs,” in Proc. Can. Conf. VLSI, Halifax, NS, Canada, Oct. 1992, pp. 18–20.

[3] Y.-W. Chang, J.-M. Lin, and D. F. Wong, “Graph matching-based algorithms for FPGA segmentation design,” Proc. IEEE/ACM Int. Conf. Computer-Aided Design, pp. 34–39, Nov. 1998.

[4] A. El Gamal, J. Greene, J. Reyneri, E. Rogoyski, K. El-Ayat, and A. Mohsen, “An architecture for electrically configurable gate arrays,” IEEE J. Solid-State Circuits, vol. 24, pp. 394–398, Apr. 1989.

[5] A. El Gamal, J. Greene, and V. Roychowdhury, “Segmented channel routing is nearly as efficient as channel routing (and just as hard),” in Proc. Advanced Research VLSI, Santa Cruz, CA, Mar. 1991, pp. 193–221.

[6] M. Khellah, S. Brown, and Z. Vranesic, “Modeling routing delays in SRAM-based FPGAs,” in Proc. Can. Conf. VLSI, Banff, AB, Canada, Nov. 1993, pp. 14–16.

[7] M. J. Lorenzetti and D. S. Baeder, “Routing,” in Physical Design Automation of VLSI Systems, B. Preas and M. Lorenzetti, Eds. Redwood City, CA: Benjamin Cummings, 1988, ch. 5.

[8] W. K. Mak and D. F. Wong, “Channel segmentation design for symmetrical FPGAs,” Proc. IEEE Conf. Computer Design, pp. 496–501, Oct. 1997.

[9] C. H. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms and Complexity. Englewood Cliffs, NJ: Prentice-Hall, 1982.

[10] M. Pedram, B. S. Nobandegani, and B. T. Preas, “Design and analysis of segmented routing channels for row-based FPGAs,” IEEE Trans. Computer-Aided Design, vol. 13, no. 12, pp. 1470–1479, Dec. 1994.

[11] J. Rose and D. Hill, “Architectural and physical design challenges for one-million gate FPGA’s and beyond,” in ACM/SIGDA Int. Symp. Field-Programmable Gate Arrays, Monterey, CA, Feb. 1997, pp. 129–132.

[12] K. Roy and M. Mehendale, “Optimization of channel segmentation for channelled architecture FPGAs,” Proc. IEEE Custom Integrated Circuits Conf., pp. 4.4.1–4.4.4, May 1992.

[13] K. Zhu and D. F. Wong, “Segmented channel segmentation design for row-based FPGAs,” Proc. IEEE/ACM Int. Conf. Computer-Aided Design, pp. 26–29, Nov. 1992.

On Diagnosis and Diagnostic Test Generation for Pattern-Dependent Transition Faults

Irith Pomeranz and Sudhakar M. Reddy

Abstract—We propose a method of modeling pattern dependence as part of the existing delay fault models without incurring the complexity of considering physical effects that cause pattern dependence. We apply the method to transition faults. We define the conditions under which two pattern-dependent transition faults can be said to be distinguished by a given test set. We provide experimental results to demonstrate the diagnostic resolutions obtained under the proposed model. We also present conditions for identifying pairs of indistinguishable pattern-dependent transition faults and propose a procedure for generating diagnostic tests for distinguishable pattern-dependent transition faults.

Index Terms—Fault diagnosis, pattern-dependent delay defects, transition faults.

I. INTRODUCTION

Diagnosis of delay faults is important for identifying and possibly correcting timing-related errors in the design and manufacturing processes of a high-performance chip. Diagnosis of delay faults of various types was considered in [1]–[4]. Path delay faults were considered in [1] and [4] and gate delay faults were considered in [2]. Delays resulting from process variations were considered in [3]. In these works, delay faults are assumed to be independent of the input patterns applied to the circuit. Thus, if two different tests detect the same fault, faulty behavior is expected under both tests in the presence of the fault. In [5]–[7], it was shown that the delays throughout a circuit may be pattern dependent. This is because certain signal transitions may be sped up or delayed depending on the values of other lines in the circuit. Pattern dependence of delays in the emerging technology of silicon on insulator is considered one of the challenges in using this technology [8]. As a result of delays being pattern dependent, delay defects may show pattern-dependent behavior as well. This implies that two tests detecting the same delay defect may not both result in faulty output values in the presence of the defect. Consequently, the following situation may occur.

Consider two different tests that detect the same fault. Consider a defect that has the same effect as the fault (i.e., it delays the same transitions by the same amount as the fault). However, it exhibits pattern dependence. Assuming that the defect is present in the circuit, it is possible that because of values of other lines in the circuit, the transitions on the faulty lines would occur on time under the first test, resulting in fault-free output values while the same transitions may be delayed under the second test, resulting in faulty output values.

The difficulty in working with pattern-dependent defects results from the fact that the physical effects causing the pattern dependence are complex. Consequently, fault diagnosis, which takes all these effects into account, may be complex if not impossible because of the inability to model accurately chip behavior in the presence of defects.

Manuscript received May 22, 2000; revised November 15, 2000. This work was supported in part by the National Science Foundation under Grant MIP-9725053 and in part by the Semiconductor Research Corporation under Grant 98-TJ-645. This paper was presented in part at the 37th Design Automation Conference, Los Angeles, CA, June 2000. This paper was recommended by Associate Editor R. Aitken.

I. Pomeranz is with the School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907 USA.

S. M. Reddy is with the Electrical and Computer Engineering Department, University of Iowa, Iowa City, IA 52242 USA.