Timing modeling and optimization under the transmission line model

Tai-Chen Chen, Song-Ra Pan, and Yao-Wen Chang, Member, IEEE

Abstract—As the operating frequency increases to gigahertz and the rise time of a signal is less than or comparable to the time-of-flight delay of a wire, it is necessary to consider the transmission line behavior for delay computation. In this paper, we present an analytical formula for delay computation under the transmission line model. Extensive simulations with SPICE show the high fidelity of the formula. Compared with previous works, our model leads to smaller average errors in delay estimation. Based on this formula, we show that the minimum delay for a transmission line with reflection occurs when the number of round trips is minimized (i.e., equals one). In addition, we show that the delay of a circuit path is a posynomial function in wire and buffer sizes, implying that a local optimum is equal to the global optimum. Thus, we can apply any efficient search algorithm, such as the well-known gradient search procedure, to compute the globally optimal solution. Experimental results show that simultaneous wire and buffer sizing is very effective for performance optimization under the transmission line model.

Index Terms—Buffer sizing, delay model, inductance, interconnect, performance optimization, transmission line, wire sizing.

NOTATION

We use the following notation in this paper. Resistance of a gate with unit size. Resistance of gate .

Capacitance of a gate with unit size. Capacitance of gate .

Size of gate .

Capacitance of a wire with unit size. Inductance of a wire with unit size. Sheet resistance of a wire.

Width of wire . Length of wire .

Characteristic impedance of wire . Propagation velocity of wire . Driver resistance.

Load capacitance.

High voltage of power supply.

Minimum voltage at the input of a logic gate required so that the gate switches.

Manuscript received April 7, 2002; revised January 30, 2003. This work was supported in part by the National Science Council of Taiwan ROC under Grant NSC-91-2215-E-002-036.

T.-C. Chen is with the Graduate Institute of Electronics Engineering, National Taiwan University, Taipei 106, Taiwan (e-mail: d0943008@ee.ntu.edu.tw).

S.-R. Pan is with the Department of Electronic and Computer Engineering, University of California at Santa Barbara, CA 93106 USA (e-mail: srpan@ece.ucsb.edu).

Y.-W. Chang is with the Department of Electrical Engineering and the Graduate Institute of Electronics Engineering, National Taiwan University, Taipei 106, Taiwan (e-mail: ywchang@cc.ee.ntu.edu.tw).

Digital Object Identifier 10.1109/TVLSI.2003.820529

Transmission coefficient at point if a signal is transmitted from point to point .

Reflection coefficient at point if a reflection travels from point to point .

Voltage attenuation coefficient on wire if a signal is transmitted from its source to sink.

I. INTRODUCTION

AS THE operating frequency increases to gigahertz, the rise time of a signal is less than or comparable to the time-of-flight delay of a wire. Also, the die size is getting larger, resulting in longer global interconnection lines. These trends make it important to consider the transmission line behavior for delay computation [1]. Transmission line effects become significant when the rise time is short relative to the time of flight, which is the wire length divided by the propagation velocity [1]; Section II-D gives the precise conditions. There are two kinds of transmission lines. A wire with negligible resistance is called a lossless transmission line. However, on-chip interconnections have significant resistance, and they should be treated as lossy transmission lines [1], [6], [18]. It is therefore more accurate and desirable to consider line resistance for timing estimation and optimization. In this paper, we focus on lossy transmission lines.

When two transmission lines on a chip are connected and the two wires have different characteristic impedances, the impedance mismatch can cause reflections at the junction point [1], [13]. Since reflections may cause logic failure or increase delay, the impedance discontinuities at junction points must be controlled in order to minimize the side effects of reflections. On the one hand, if the driving resistance is larger than the wire impedance, multiple trips (a trip is one traversal of a signal from one end of a line to the other) are required to switch on the load; on the other hand, if the driving resistance is smaller than the wire impedance, the load may be falsely triggered. We can eliminate the reflections by matching the driving resistance and the wire impedance. The driving resistance of a gate and the impedance of a wire are approximately inversely proportional to the gate size and the wire width, respectively. Hence, wire and gate sizing can affect the delay, implying that sizing circuit components (wires and buffers) is applicable to delay optimization.

A. Previous Work

Timing is a crucial concern in high-performance circuits. Many techniques such as wire sizing and gate sizing have been proposed to optimize timing (e.g., [3]–[5], [12]); however, most of the techniques are based on the Elmore delay model [8]. Modeling and analysis techniques for simulation and timing optimization under the lossy transmission line model have been studied extensively in the literature [9]–[11], [14], [16], [17], [20], [23], [25]–[29]. The works in [16] and [20] proposed precise methods for simulating waveforms, but they did not present any delay estimator. The works in [17] and [27] modeled the transmission line effect; however, they did not consider delay optimization. Several works in the literature consider the minimization of delay under the transmission line model. Gao and Wong in [9] and [10] applied continuous wire sizing to minimize delay under the lossy transmission line model; however, they focused on exponentially tapered wires. Ismail and Friedman in [11] computed a uniform buffer size and the number of buffers to optimize the delay of a circuit path under the lossy transmission line model; however, their formula does not handle wire sizing. Lin and Pileggi in [14] proposed a wire-sizing formulation with second-order central moments, but their wire-sizing formulation under the transmission line model is not always a posynomial program and, thus, there is no optimality guarantee. The works in [25] and [29] adopted the S-parameter macro delay model to minimize delay and skew, but the sensitivities were computed at each step using finite difference approximation, which requires expensive computation. The works in [23], [26], and [28] adopted higher order moments to minimize delay, but their delay models were computationally expensive.

Fig. 1. A gate is the load of its upstream stage, but is the driver of its downstream stage. A lossy transmission line is represented by a series of sections of resistance, inductance, and capacitance; alternatively, the inductance and capacitance of each section can be merged into a characteristic impedance.

B. Our Contribution

In this paper, we focus on delay modeling and timing optimization under the transmission line model. Unlike most previous works that are based on relatively complicated models (e.g., [16], [17], [26], and [27]) or incur larger errors (e.g., [8] and [11]), we present a simple yet accurate formula for delay computation under the lossy transmission line model. Extensive simulations with SPICE show that the formula has high fidelity, with an average error of 5.61% for lossy transmission lines. Based on this formula, we show that the minimum delay for a lossy transmission line with reflection occurs when the number of round trips is minimized (i.e., equals one). In addition, we show that the delay of a circuit path is a posynomial function in wire and buffer sizes, implying that a local optimum is equal to the global optimum. Thus, we can apply any efficient search algorithm, such as the well-known gradient search procedure, to compute the optimal wire and buffer sizes for timing optimization of a circuit path. For a routing tree (a tree that interconnects all signal terminals of a net), we propose a two-stage algorithm to optimize the delay. In the first stage, we traverse the tree to determine its critical path and delay. In the second stage, we control the reflections at all branching points to prevent receivers from being falsely triggered and to minimize the critical path delay. We repeat the two stages until there are no further improvements in the delay of the tree. Experimental results show that simultaneous wire and buffer sizing is very effective in minimizing the delays of circuit paths under the transmission line model.

The remainder of this paper is organized as follows. Section II gives the gate and the transmission line models. Section III formulates the problem. Section IV considers simultaneous wire and buffer sizing for delay optimization. Section V extends the approach to a general routing tree. Section VI shows the experimental results, and finally, concluding remarks are given in Section VII.

II. TRANSMISSION LINE MODEL

In this section, we give the wire and gate models and discuss the transmission line effects, which are important when the rise time is short relative to the time of flight (the wire length divided by the propagation velocity) [1].

A. Gate and Wire Modeling

Fig. 1 illustrates the gate and the lossy transmission line models used in this paper. For a gate of a given size, the gate resistance is the unit-sized gate resistance divided by the size, and the gate capacitance is the unit-sized gate capacitance multiplied by the size.

A uniform lossy transmission line of a given width can be represented by a series of sections of unit-length resistance, unit-length inductance, and unit-length capacitance, which are obtained from the sheet resistance, the unit-sized inductance, and the unit-sized capacitance of a wire, respectively, together with the wire width. The effect of inductance and capacitance can be represented by a characteristic impedance, which equals the square root of the ratio of the unit-length inductance to the unit-length capacitance. The propagation velocity of a wire equals the reciprocal of the square root of the product of the unit-length inductance and the unit-length capacitance [1]. For a wire of a given length, the total resistance, total inductance, and total capacitance are the corresponding unit-length values multiplied by the length.
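The gate and wire models above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the names `WireTech`, `gate_rc`, `wire_per_length`, and `impedance_and_velocity` are hypothetical, and the width scaling of inductance (divided by width) and capacitance (multiplied by width) is an assumption chosen so that the characteristic impedance is inversely proportional to the wire width, as stated in the Introduction.

```python
import math
from dataclasses import dataclass


@dataclass
class WireTech:
    """Hypothetical container for per-technology wire parameters."""
    sheet_res: float  # sheet resistance of a wire
    unit_ind: float   # inductance of a unit-width, unit-length wire
    unit_cap: float   # capacitance of a unit-width, unit-length wire


def gate_rc(unit_res, unit_cap, size):
    """Gate resistance shrinks and gate capacitance grows with gate size."""
    return unit_res / size, unit_cap * size


def wire_per_length(tech, width):
    """Per-unit-length resistance, inductance, and capacitance of a wire."""
    r = tech.sheet_res / width  # follows from the definition of sheet resistance
    l = tech.unit_ind / width   # ASSUMPTION: inductance scales as 1/width
    c = tech.unit_cap * width   # ASSUMPTION: capacitance scales with width
    return r, l, c


def impedance_and_velocity(l, c):
    """Characteristic impedance sqrt(l/c) and propagation velocity 1/sqrt(l*c)."""
    return math.sqrt(l / c), 1.0 / math.sqrt(l * c)


def wire_totals(tech, width, length):
    """Total resistance, inductance, and capacitance of a wire segment."""
    r, l, c = wire_per_length(tech, width)
    return r * length, l * length, c * length
```

Under these assumptions, doubling the wire width halves the characteristic impedance and leaves the propagation velocity unchanged, which is why wire sizing can be used to match the wire impedance to the driver.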

Therefore, with the gate and the lossy transmission line models, we can represent a circuit path by resistors, capacitors, and characteristic impedances. Fig. 2 illustrates the resulting circuit model for a circuit path with buffers, including the driver resistance and the load capacitance.

B. Reflections on a Wire

Due to inductive and capacitive discontinuities, the resulting reflections may cause logic failure or excessively long delay [1], [13]. As shown in Fig. 3, a gate drives a lossy transmission line and the next gate. Inductive and capacitive discontinuities may occur at the source end and the receiving end of the line. The initial voltage at the receiving end is the sum of the signal sent out from the source end and the reflection generated at the receiving end. When the reflection generated at the receiving end travels backward to the source end, a new reflection generated at the source end is transmitted toward the receiving end. The new voltage at the receiving end is the sum of the incoming reflection, the new outgoing reflection, and the initial voltage.

Fig. 2. A circuit path (with lossy transmission lines) is a combination of resistors, capacitors, and characteristic impedances.

Fig. 3. A resistor with resistance r drives a lossy transmission line with characteristic impedance Z and a capacitor with capacitance c.

On the one hand, as shown in Fig. 4(a), if the resistance of the driving gate is larger than the characteristic impedance of the wire, the initial voltage at the receiving end might not reach the threshold voltage. Thus, multiple round trips along the wire may be required to correctly transmit a signal. On the other hand, as shown in Fig. 4(b), if the gate resistance is smaller than the wire impedance, a reflection generated at the source end is negative since the source reflection coefficient is negative. Therefore, the voltage may oscillate at the receiving end, causing overshoot or undershoot. This oscillating pattern is called ringing. If the gate resistance matches the wire impedance, the source reflection coefficient is zero. Thus, no reflections are generated at the source end.
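The three regimes described above follow from the standard source reflection coefficient of transmission line theory. The sketch below is illustrative only; the function names are hypothetical, and it ignores line loss and the capacitive load, so it is not the delay model of this paper.

```python
def source_reflection_coefficient(r_driver, z0):
    """Standard reflection coefficient seen by a wave arriving at the source end."""
    return (r_driver - z0) / (r_driver + z0)


def launched_fraction(r_driver, z0):
    """Fraction of the supply voltage initially launched into the line
    (voltage divider between the driver resistance and the line impedance)."""
    return z0 / (r_driver + z0)


def classify_driver(r_driver, z0):
    """Map the sign of the source reflection coefficient to the cases of Fig. 4."""
    gamma = source_reflection_coefficient(r_driver, z0)
    if gamma > 0:
        return "overdamped: multiple round trips may be required"
    if gamma < 0:
        return "underdamped: ringing may occur"
    return "matched: no reflection at the source end"
```

For example, a 100-ohm driver on a 50-ohm line launches one third of the supply voltage and has a positive source reflection coefficient, whereas a 25-ohm driver launches two thirds and has a negative one.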

C. Voltage Attenuation on a Wire

In a lossy transmission line, the resistance of a wire causes voltage attenuation, and the voltage attenuation coefficient along a lossy transmission line is derived in [1] as follows:

(1)

Therefore, in Fig. 3, the voltage at the receiving end before reflection is given by

(2)

D. When to Use Transmission Line Analysis

According to [1], [15], and [21], the transmission line behavior is significant when

(3)

and

(4)

where (3) compares the rise time of the wire with the time-of-flight delay, and (4) compares the total resistance with the characteristic impedance. As illustrated in Fig. 3, we can rewrite (3) and (4) as Inequalities (5) and (6) as follows:

(5)

and

(6)

Besides, for the voltage at the receiving end to correctly drive the next gate, the voltage at that point after infinite reflections should be greater than or equal to the minimum voltage required for the gate to switch. In other words, the following inequality must be satisfied:

(7)

Therefore, we should model a wire as a lossy transmission line if Inequalities (5)–(7) are satisfied; otherwise, it should be modeled as a distributed RC line.

Fig. 4. (a) Multiple round trips are required to correctly transmit a signal. (b) Ringing may cause logic failures.

TABLE I

RC PARAMETERS OF THE 0.13-µm TECHNOLOGY IN SIA'99

Note that (5) can be reduced as follows by discarding :

(8)

Since , ringing occurs [see Fig. 4(b)]. If ringing occurs, we may need to model a wire as a transmission line; otherwise, it should be modeled as a distributed RC line.

E. Delay Model

In this section, we introduce our delay model, which is based on the RC model. However, since the resistive loss causes voltage attenuation on a wire and the discontinuity of impedance at a junction point incurs reflection, we need to modify the original RC model. First, as given in (1), the voltage attenuation coefficient along a lossy transmission line is less than 1 because its exponent is negative. Therefore, the effective resistance is not equal to the total resistance of the wire, implying that the pull-up resistance needs to be modified. Second, due to voltage attenuation and reflection, the final voltage may not equal the supply voltage, implying that we need to use an approximate method to correct the delay model. Third, due to reflection, multiple round trips may be required to correctly transmit a signal, implying that the number of round trips must be considered in the delay model. We describe in detail how to modify the original RC model in the following.

The time for charging the capacitive load (defined at 50% of the final value) of the lumped network is proportional to the product of the pull-up resistance and the total capacitive load [18], [19], [24]. According to [1], the current that a lossless transmission line can supply is limited by its characteristic impedance. As a result, looking from the receiving end, the line behaves like a resistor with a value equal to its characteristic impedance. In a lossy transmission line, not only its characteristic impedance, but also its effective resistance supplies the current. If the total resistance of a wire caused complete voltage attenuation, the voltage at the receiving end would become zero and the effective resistance would equal the total resistance. From Section II-C, the voltage at the receiving end is scaled by the attenuation coefficient. This implies that only a fraction of the total resistance of the line, determined by the attenuation coefficient, causes voltage attenuation and supplies current.

Consequently, the pull-up resistance for the transmission line is equal to the sum of the characteristic impedance of the line, and partial resistance of the wire which causes voltage attenuation. We have the pull-up resistance for the line as follows:

(9)

Hence, the time for charging the capacitive load (at 50% of the voltage of the first overshoot) of a transmission line is given by

(10)
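The reasoning that leads to (9) and (10) can be sketched numerically. Because the printed equations are not reproduced here, the sketch rests on assumptions that should be checked against the original: the attenuation coefficient is taken to be the common low-loss form exp(-R/(2Z)), the fraction of the total resistance that contributes is taken to be one minus that coefficient, and the 50% charging time uses the exact single-pole factor ln 2. All function names are hypothetical.

```python
import math


def attenuation(r_total, z0):
    """ASSUMPTION: low-loss attenuation coefficient exp(-R / (2 * Z0))."""
    return math.exp(-r_total / (2.0 * z0))


def pull_up_resistance(r_total, z0):
    """Characteristic impedance plus the attenuating share of the wire resistance.

    ASSUMPTION: the share is (1 - attenuation), following the prose argument.
    """
    return z0 + (1.0 - attenuation(r_total, z0)) * r_total


def charging_time_50(r_total, z0, c_load):
    """Time for a single-pole RC stage to reach 50% of its final value."""
    return math.log(2.0) * pull_up_resistance(r_total, z0) * c_load
```

In the lossless limit (zero wire resistance) the pull-up resistance reduces to the characteristic impedance, which agrees with the statement that a lossless line behaves like a resistor of that value when seen from the receiving end.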

With , , and the effect of reflection, the voltage of the first overshoot, , at the receiving end after reflection equals , which may not equal . Thus, we can use an approximate method that divides by to obtain the charging time, , for which the voltage equals . Therefore, we have

(11)

Since transmission line analysis always gives the correct answer independent of the rise time of the driver, the delay is the sum of the time of flight along the wire and the time for charging the capacitive load [1], [18]. Thus, the propagation delay from the gate to the next gate in Fig. 3 is given by

(12)

which depends on the number of round trips required to correctly transmit a signal.

F. Accuracy

We used SPICE to verify the accuracy of our delay model. The experiments were performed on a single wire with no buffers.

The parameters we used are listed in Table I: the unit capacitance of a wire, the unit inductance of a wire, the sheet resistance of a wire, the unit capacitance and resistance of a gate, the area of a minimum-size buffer, the driver resistance, and the load capacitance. This set of parameters is based on the 0.13-µm technology of the SIA'99 roadmap [22].

In the first and second experiments, we used fixed wire lengths (2.5 and 5 mm) with a variety of wire widths. The wire widths for all experiments satisfy (5)–(7). Therefore, the wire widths ranged from 130 to 480 nm for the first experiment, and from 130 to 530 nm for the second experiment. In Fig. 5, the delays are plotted as functions of the wire widths for SPICE, Elmore, I&F, and our delay models, where I&F denotes the delay model presented in [11]. Tables II and III show the experimental results, listing for each wire width the delay calculated by SPICE and, for each of the Elmore, I&F, and our delay models, the calculated delay and its percentage error relative to SPICE. The percentage of the error is calculated by , where denotes Elmore, I&F, or Ours. For the lossy transmission line of length 2.5 mm (5 mm), compared to SPICE, the maximum error calculated by the Elmore delay model is and the average error is 29.20% (12.05%), the maximum error calculated by the I&F delay model is (11.34%) and the average error is 2.98% (4.70%), and the maximum error calculated by our delay model is 6.58% (12.38%) and the average error is 3.80% (6.22%).

Fig. 5. Comparison of the delays calculated by SPICE, Elmore, I&F, and our delay models for lossy transmission lines; (a) wire length = 2.5 mm; (b) wire length = 5 mm.

TABLE II

EXPERIMENTAL RESULTS FOR THE ACCURACY OF ELMORE, I&F, AND OUR DELAY MODELS FOR LOSSY TRANSMISSION LINES; WIRE LENGTH = 2.5 mm

TABLE III

EXPERIMENTAL RESULTS FOR THE ACCURACY OF ELMORE, I&F, AND OUR DELAY MODELS FOR LOSSY TRANSMISSION LINES; WIRE LENGTH = 5 mm

In the third and fourth experiments, we used fixed wire widths (500 and 130 nm) with a variety of wire lengths. As mentioned earlier, the wire lengths for all experiments satisfy (5)–(7). Therefore, the wire lengths ranged from 3.7 to 6.2 mm for the third experiment, and from 0.82 to 7 mm for the fourth experiment. In Fig. 6, the delays are plotted as functions of the wire lengths for SPICE, Elmore, I&F, and our delay models. Tables IV and V show the experimental results, where Length denotes the wire length. For the lossy transmission line of width 500 nm (130 nm), compared to SPICE, the maximum error calculated by the Elmore delay model is 10.01% (51.74%) and the average error is 4.11% (30.55%), the maximum error calculated by the I&F delay model is 10.99% (14.19%) and the average error is 9.52% (5.95%), and the maximum error calculated by our delay model is 1.99% (20.31%) and the average error is 1.49% (10.94%).

According to the above four experiments, the average error of our delay model is 5.61%. Besides, based on the observations from the simulations, the delays computed from our model are upper bounds of those obtained by SPICE, which makes our model a reliable delay estimator under the lossy transmission line model. The Elmore delay model, however, has significantly negative percentage errors. Therefore, the Elmore delay model is not a suitable delay estimator for the lossy transmission line model. Also, the I&F delay model incurs positive as well as negative errors for different wire widths of the same length. Hence, although the I&F delay model may be more accurate in some corner cases, it is less suitable for delay estimation under the lossy transmission line model when we apply wire sizing to optimize a circuit. Circuit designers often prefer overestimating delay to underestimating it, since an overoptimistic estimate of delay may lead to timing violations. Therefore, our delay model should be more suitable than the Elmore and I&F delay models for practical applications. Notice that the maximum inaccuracy of our delay model occurs at the minimum wire size and the maximum wire length. The reason for this phenomenon is that the total resistance is comparable to the impedance. According to Section II-D, the transmission line behavior is insignificant in this situation.

Fig. 6. Comparison of the delays calculated by SPICE, Elmore, I&F, and our delay models for lossy transmission lines; (a) wire width = 500 nm; (b) wire width = 130 nm.

TABLE IV

EXPERIMENTAL RESULTS FOR THE ACCURACY OF SPICE, ELMORE, I&F, AND OUR DELAY MODELS FOR LOSSY TRANSMISSION LINES; WIRE WIDTH = 500 nm

TABLE V

EXPERIMENTAL RESULTS FOR THE ACCURACY OF SPICE, ELMORE, I&F, AND OUR DELAY MODELS FOR LOSSY TRANSMISSION LINES; WIRE WIDTH = 130 nm
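The summary figures in this section can be cross-checked with a short script. The per-experiment average errors below are the values quoted in the text; the error definition is an assumption (signed error relative to SPICE, so that an overestimate is positive), chosen to agree with the remark that negative errors correspond to underestimated delays. The function and variable names are hypothetical.

```python
def percent_error(model_delay, spice_delay):
    """ASSUMPTION: signed error relative to SPICE, in percent."""
    return 100.0 * (model_delay - spice_delay) / spice_delay


# Average errors (in percent) quoted for the four experiments:
# length 2.5 mm, length 5 mm, width 500 nm, width 130 nm.
average_errors = {
    "Elmore": [29.20, 12.05, 4.11, 30.55],
    "I&F": [2.98, 4.70, 9.52, 5.95],
    "Ours": [3.80, 6.22, 1.49, 10.94],
}

overall = {name: sum(vals) / len(vals) for name, vals in average_errors.items()}

for name, value in overall.items():
    print(f"{name}: {value:.2f}%")
```

The unweighted mean of the four averages for our model is 5.6125%, which matches the 5.61% reported above.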

III. PROBLEM FORMULATION

This paper aims to minimize delay by sizing circuit components. We formulate this problem as follows:

• Input: A circuit path and the lower and upper bounds for wire and buffer sizes.

• Objective: Determine the optimal wire and buffer sizes for each segment in a circuit path, so that delay is minimized.

We will reformulate this problem for a routing tree in Section V.

IV. OPTIMAL WIRE AND BUFFER SIZING FOR A PATH

A. Reflection Considerations

In practice, designers typically desire to optimize performance without generating undesirable reflections and to transmit a signal correctly within a limited number of round trips. As VLSI technology advances, the wire length is increasing and the capacitance of a gate is decreasing, making the time-of-flight delay dominate the delay. Therefore, we have the following theorem for the optimal number of round trips for delay optimization.

Theorem 1: Considering reflections, the minimum delay based on our model for a circuit section occurs when the number of round trips equals one.

Proof: With the gate and the wire models described in Section II-A, we can divide a circuit path into sections, and the sections can be handled one by one. Consider a section shown in Fig. 3. Suppose on the contrary that the minimum delay for a circuit section occurs when the number of round trips is larger than one. Let and be the values that result in a (local) minimal delay for the circuit section when the number of round trips equals one, and and be the values that result in the globally minimum delay (the number of round trips is larger than one). According to (12), we have the following:

(13)

and

(14)

where is the (local) minimal delay when the number of round trips equals one, is the globally minimum delay, and is the number of round trips. Here, . Since and a wire may need to be modeled as a transmission line if ringing occurs (see Section II-D), the first undershoot for is smaller than , implying that the first undershoot for is smaller than that for . The first undershoot can be calculated as follows:

(15)

where or , , and . On the one hand, if increases as increases, , implying that . On the other hand, if decreases as increases, , implying that . To show that , we need to discuss the following two cases:

Case 1) :

1) and : The first and second terms of (14) are always larger than those of (13). Thus, .

2) and : Subtracting (14) from (13), we have

(16)

By (7), we have

(17)

When , can be as large as . By (8) and (17), we have the range of as follows:

(18)

Therefore, by (18), (16) can be rewritten as follows:

(19)

According to (6), the minimum of the right-hand side of (19) occurs when , resulting in the minimum value 0.12 . Thus, we have .

3) and : According to Case 1.1 and Case 1.2, .

4) and : Let and , where , . Since , . The first undershoot caused by and is the same as that caused by and , where and . Substituting for and for , the second term of (13) becomes smaller. Therefore, when the number of round trips equals one, and lead to a (local) minimal delay for the circuit section, contradicting the assumption that and give a (local) minimal delay for the circuit section. Thus, the case that and will never happen.

5) and : Let and , where , . Since , . Similar to Case 1.4, and lead to the globally minimum delay, where and , contradicting the assumption that and give the globally minimum delay. Thus, the case that and will never happen.

Case 2) :

1) and : Subtracting (14) from (13), we have

(20)

According to (6), the minimum of the right-hand side of (20) occurs when , resulting in the minimum value 2.06 . Thus, we have .

2) and : The first and second terms of (14) are always larger than those of (13). Thus, .

3) and : According to Case 2.1 and Case 2.2, .

4) and : Let and , where , . Since , . Similar to Case 1.4, and lead to a (local) minimum delay, where and , contradicting the assumption that and give a (local) minimal delay for the circuit section. Thus, the case that and will never happen.

5) and : Let and , where , . Since , . Similar to Case 1.4, and lead to the globally minimum delay, where and , contradicting the assumption that and give the globally minimum delay. Thus, the case that and will never happen.

Therefore, the globally minimum delay occurs when the number of round trips equals one.

According to Theorem 1, we can rewrite (12) as follows:

(21)

where

B. Optimal Wire Sizing

In this section, we minimize the delay of a circuit path by wire sizing. If all buffer sizes and locations are fixed, the delay function of a circuit path from the source to sink with segments can be calculated as follows:

(22)

Notice that (22) is a posynomial function in the wire sizes, implying that the wire-sizing problem has a unique global minimum [2], [7]. Thus, we can apply any efficient search algorithm, such as the well-known gradient search procedure, to find a locally optimal solution, which is also the globally optimal solution.

Theorem 2: With fixed buffer sizes and locations, the delay of a circuit path based on our model is a posynomial function in wire sizes.
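To illustrate why a posynomial objective makes gradient search sufficient, the sketch below minimizes a toy one-variable posynomial, f(w) = a/w + b·w, within size bounds. The function is a hypothetical stand-in for (22), not the paper's delay expression: one term falls with width (resistance-like) and one grows with width (capacitance-like). After the substitution w = exp(y), the logarithm of a posynomial is convex in y, so a projected gradient step on log f converges to the global minimum.

```python
import math


def minimize_posynomial_1d(a, b, w_lo, w_hi, steps=200, lr=0.5):
    """Projected gradient descent on log f(exp(y)) for f(w) = a / w + b * w.

    a, b, w_lo, w_hi are hypothetical positive inputs with w_lo <= w_hi.
    """
    y_lo, y_hi = math.log(w_lo), math.log(w_hi)
    y = 0.5 * (y_lo + y_hi)  # start in the middle of the bounds (log scale)
    for _ in range(steps):
        f = a * math.exp(-y) + b * math.exp(y)
        df_dy = -a * math.exp(-y) + b * math.exp(y)
        y -= lr * df_dy / f           # gradient of log f with respect to y
        y = min(max(y, y_lo), y_hi)   # project back onto the size bounds
    return math.exp(y)


if __name__ == "__main__":
    w_star = minimize_posynomial_1d(a=4.0, b=1.0, w_lo=0.5, w_hi=10.0)
    print(w_star)  # analytic optimum is sqrt(a / b)
```

When the unconstrained optimum sqrt(a/b) lies outside the bounds, the projection step makes the search settle on the nearest bound, which mirrors the lower and upper size limits given as input in Section III.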

C. Optimal Buffer Sizing

In this section, we minimize the delay of a circuit path by buffer sizing. If all wire sizes and buffer locations are fixed, the delay function of a circuit path from the source to sink with segments can be calculated as follows:

(23)

Notice that (23) is also a posynomial function in the buffer sizes, implying that the buffer-sizing problem has a unique global minimum [2], [7]. Thus, we can apply any efficient search algorithm, such as the well-known gradient search procedure, to find a locally optimal solution, which is also the globally optimal solution.

Theorem 3: With fixed wire sizes and buffer locations, the delay of a circuit path based on our model is a posynomial function in buffer sizes.

D. Optimal Simultaneous Wire and Buffer Sizing

In this section, we minimize the delay of a circuit path by simultaneous wire and buffer sizing. If all buffer locations are fixed, the delay function of a circuit path from the source to sink with segments is the same as (23).

Notice that (23) is also a posynomial function in the wire and buffer sizes, implying that the simultaneous wire- and buffer-sizing problem has a unique global minimum [2], [7]. Thus, we can apply any efficient search algorithm, such as the well-known gradient search procedure, to find a locally optimal solution, which is also the globally optimal solution.

Fig. 7. A signal is sent out from the point 0 and then passes through the point 1 to the point 2.

Theorem 4: With fixed buffer locations, the delay of a circuit path based on our model is a posynomial function in wire and buffer sizes.

V. EXTENSIONS TO WIRE AND BUFFER SIZING FOR A ROUTING TREE

Given a routing tree, our objective is to minimize the critical path delay under the constraints that the first undershoot at each branching point is within the same signal level, and the number of round trips required for correctly transmitting a signal from the root to each load is at most one. We formulate this problem as follows:

• Input: A routing tree and the lower and upper bounds for wire and buffer sizes.

• Output: Determine the optimal wire and buffer sizes of the tree, so that the critical path delay is minimized under the constraints that the first undershoot is within the same signal level, and the number of round trips is at most one.

We shall first discuss the problem on binary routing trees, and then apply the technique to general routing trees.

A. Reflection Constraints

As shown in Fig. 7, when a signal is sent out from the source and passes through the point 1 to the point 2, a reflection may be generated at the point 2 and travel backward to the point 1. When the reflection reaches the point 1, the voltage at the point 1 will be disturbed. Further, if a reflection propagated down to a load is large enough, it could cause logic failure at the load. To prevent false triggering of the load, the reflection coefficient at each node must be large enough. For the example shown in Fig. 7, if the reflection coefficient at the point 1 is larger, the reflections generated at the points 2 and 3 have a smaller impact on the point 1. According to [1], the reflection coefficient is given by

(24)

By (24), becomes larger when and are

smaller. If becomes larger, the transmission coefficient is smaller. When a reflection generated at the point 2 travels backward to the point 1, the impact of the reflection may be negligible if is small enough. Similarly, the impact of the reflections generated at the point 3 on other points can also be negligible. For each point

Fig. 8. A signal is sent from the pointi 0 1 to the point i, and then i + 1 and i + 2. The impact of the reflections generated at the points i + 1 and i + 2 on the pointi may be negligible if and are small enough.

Fig. 9. A binary routing tree without buffers.

of a routing tree, if the reflections generated at the point have little interference at other points, a signal can be correctly transmitted from the source to the loads. In order to correctly transmit a signal from the source of a routing tree to each load, the voltage at each branching point must be larger than or equal to the threshold voltage within one round trip. As shown in Fig. 8, the following constraint must be satisfied for the point : (25) Based on (25), the initial voltage at the point will be greater than or equal to the threshold voltage when a signal from the point arrives at the point . Let denote the edge be-tween the points and , and represent the length of .

Since , , and in Fig. 8 could be different, the reflections generated at those points will arrive at the point at different times. Without loss of generality, assume that . The first reflection to arrive at the point is sent out from the point , the next is from the point , and the last is generated at the point . In order to prevent the reflections from changing the signal level at the point , we have the following constraints:

(26)

(27)

(28)

If all constraints are satisfied, the reflection coefficient at each point will be large enough, implying that the reflections generated at the point have little interference at other points. As a result, a signal can be correctly transmitted from the source to the loads in a routing tree.
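As a numerical illustration of how reflection and transmission at a branching point vary with the branch impedances, the following Python sketch evaluates the standard textbook voltage reflection coefficient of a junction. It is only a sketch under stated assumptions: the expression used here is the generic junction formula, which may differ in form from (24), lossless lines are assumed, and the function names and impedance values are hypothetical.

```python
def parallel(impedances):
    """Equivalent impedance of several lines driven in parallel."""
    return 1.0 / sum(1.0 / z for z in impedances)


def reflection_coefficient(z_parent, z_children):
    """Textbook voltage reflection coefficient seen by a wave on the parent
    line when it reaches a junction feeding the child lines (assumed form)."""
    z_load = parallel(z_children)
    return (z_load - z_parent) / (z_load + z_parent)


def transmission_coefficient(z_parent, z_children):
    """Voltage transmission coefficient at the same junction (1 + reflection)."""
    return 1.0 + reflection_coefficient(z_parent, z_children)


if __name__ == "__main__":
    # Hypothetical 50-ohm parent line feeding two identical child lines.
    for z_child in (100.0, 50.0, 25.0):
        gamma = reflection_coefficient(50.0, [z_child, z_child])
        tau = transmission_coefficient(50.0, [z_child, z_child])
        print(f"child Z = {z_child:5.1f} ohm: "
              f"reflection = {gamma:+.3f}, transmission = {tau:.3f}")
```

In this hypothetical setting, lowering the child impedances increases the magnitude of the reflection coefficient and decreases the transmission coefficient, which matches the qualitative trend described above.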

B. Delay Calculation

Given a routing tree, we number its nodes level by level, and from left to right on each level (see Fig. 7). Let , , and denote the number of edges in the tree, the set of loads, and the


TABLE VI

PARAMETERS AND EXPERIMENTAL RESULTS FOR THE ACCURACY OF ELMORE AND OUR DELAY MODELS ON A BINARY ROUTING TREE WITHOUT BUFFERS

critical path, respectively. Similar to (12), the critical path delay of a routing tree from the source to a load is given by

(29)

where

(30)

(31)

denotes the capacitance of node , , and denote the propagation velocity and the impedance of edge , respectively.

We used SPICE to verify the accuracy of our delay model. The experiments were performed on a binary routing tree with no buffers (as shown in Fig. 9). Table VI shows the parameters and the experimental results, where denotes the driver resistance, and denote the length and width of each segment, denotes the load capacitance of segments 2 and 3, denotes the delay calculated by SPICE, denotes the delay calculated by the Elmore delay model, denotes the percentage of the error between SPICE and the Elmore delay model, denotes the delay calculated by our delay model, and denotes the percentage of the error between SPICE and our delay model. The percentage of the error is calculated by , where denotes Elmore or Ours.

We propose Algorithm Find-Critical-Path (summarized in Fig. 10) to find the critical path of a routing tree . First, we determine the number of round trips along edge required to correctly transmit a signal (Line 2). The number of round trips is the minimum that satisfies the following constraint:

Fig. 10. The Algorithm for determining the critical path of a routing tree.
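The traversal step of Find-Critical-Path can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes the edge weights have already been computed (the round-trip constraint and the weight expression are not reproduced here), and the dictionary-based tree representation, the function name, and the example weights are all hypothetical.

```python
def find_critical_path(children, weight, root=0):
    """Return (delay, path) for the heaviest source-to-load path.

    children: dict mapping a node to the list of its child nodes.
    weight:   dict mapping a node j to the weight of the edge entering j.
    Each node is visited once, so the traversal is linear in the tree size.
    """
    parent = {root: None}
    delay = {root: 0.0}
    best_leaf = None
    stack = [root]                      # iterative depth-first traversal
    while stack:
        node = stack.pop()
        kids = children.get(node, [])
        if not kids and (best_leaf is None or delay[node] > delay[best_leaf]):
            best_leaf = node            # a load that ends the longest path so far
        for kid in kids:
            parent[kid] = node
            delay[kid] = delay[node] + weight[kid]
            stack.append(kid)

    # Recover the path by following parent pointers back to the source.
    path = []
    node = best_leaf
    while node is not None:
        path.append(node)
        node = parent[node]
    return delay[best_leaf], path[::-1]


if __name__ == "__main__":
    # Hypothetical binary tree: source 0, branching point 1, loads 2 and 3.
    children = {0: [1], 1: [2, 3]}
    weight = {1: 1.0, 2: 2.5, 3: 2.0}   # illustrative edge weights only
    print(find_critical_path(children, weight))
```

Storing parent pointers instead of copying partial paths keeps the work per node constant, so the sketch stays linear in the number of nodes.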

After determining the number of round trips on each edge, we label each edge with the weight (Line 3). The critical path delay is the sum of edge weights along the longest path. We then apply a depth-first traversal to compute the longest path in O(n) time, where n is the number of nodes (Line 4).

C. General Routing Tree

We extend the technique discussed in Section IV-A and B to general routing trees. As shown in Fig. 11, assume that the point i has k children, and a signal is sent out from the point i − 1 and then propagates down to the children of the point i. Without loss of generality, assume that . To prevent the reflections generated at the children from changing the signal level at the point , we have the following constraints:


Fig. 11. The point i has k children, and the signal is sent from the point i − 1 to other points.

Fig. 12. The Algorithm for minimizing the delay of a routing tree.

Fig. 13. A routing tree with buffers.

If all constraints are satisfied, the reflection coefficient at each point will be large enough; thus, a signal can be correctly transmitted from the source to the loads in a general routing tree.

D. Our Algorithm

Our objective is to minimize the critical path delay of a routing tree under the constraints that a signal can be correctly transmitted within one round trip and the reflection is sufficiently small to prevent loads from being falsely triggered. Since the delay of a routing tree is dominated by the critical path delay, our problem is to find the wire sizes that minimize the critical path delay of a routing tree subject to the constraints listed in (25)–(28). We can apply any search algorithm, such as the well-known gradient search procedure, to find a solution. Algorithm Minimize-Tree-Delay computes the minimum delay of a routing tree (see Fig. 12). It consists of two stages. The first stage applies the procedure Find-Critical-Path to compute the critical path of a routing tree. The second stage applies the gradient search procedure to determine the wire sizes that minimize the critical path delay. We repeat the two stages until the delay of the tree no longer improves.
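The two-stage iteration can be sketched in Python as follows. This is an illustrative sketch only, not the algorithm of Fig. 12: the edge delay a/w + b·w is a toy posynomial stand-in for the delay of (29), the reflection constraints (25)–(28) are replaced by simple width bounds, and all names, coefficients, and step sizes are hypothetical.

```python
def critical_path(children, weight, root=0):
    """Depth-first search for the heaviest source-to-load path.
    Returns (delay, nodes), where nodes lists the path nodes below the root."""
    best = (float("-inf"), [])
    stack = [(root, 0.0, [])]
    while stack:
        node, delay, nodes = stack.pop()
        kids = children.get(node, [])
        if not kids and delay > best[0]:
            best = (delay, nodes)
        for kid in kids:
            stack.append((kid, delay + weight[kid], nodes + [kid]))
    return best


def edge_delay(a, b, w):
    """Toy posynomial edge delay in the width w (an assumption, not (29))."""
    return a / w + b * w


def minimize_tree_delay(children, coeffs, widths, w_min, w_max,
                        step=0.05, tol=1e-9, max_rounds=1000):
    """Alternate critical-path finding and gradient steps on wire widths."""
    widths = dict(widths)

    def evaluate(ws):
        weight = {j: edge_delay(*coeffs[j], ws[j]) for j in ws}
        return critical_path(children, weight)

    best_delay, path = evaluate(widths)           # stage 1: critical path
    for _ in range(max_rounds):
        trial = dict(widths)
        for j in path:                            # stage 2: one gradient step
            a, b = coeffs[j]
            grad = -a / trial[j] ** 2 + b         # derivative of a/w + b*w
            trial[j] = min(w_max, max(w_min, trial[j] - step * grad))
        new_delay, new_path = evaluate(trial)     # stage 1 on the new sizes
        if best_delay - new_delay <= tol:         # no improvement: stop
            break
        widths, best_delay, path = trial, new_delay, new_path
    return best_delay, widths


if __name__ == "__main__":
    children = {0: [1], 1: [2, 3]}                # hypothetical binary tree
    coeffs = {1: (1.0, 1.0), 2: (4.0, 1.0), 3: (1.0, 1.0)}
    start = {1: 1.0, 2: 1.0, 3: 1.0}
    print(minimize_tree_delay(children, coeffs, start, 0.5, 4.0))
```

Because the stopping rule looks only at the tree delay, the loop may stop early when two paths are tied for critical; a fuller implementation would take a step on all tied paths together.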

E. Simultaneous Wire and Buffer Sizing for a Routing Tree

Based on the gate and wire models presented in Section II, we can divide a buffered routing tree into subtrees. In Fig. 13, the routing tree is divided into three subtrees. We can treat each subtree as a routing tree with no buffers, and then obtain the reflection constraints for each subtree. Thus, if the reflection constraints for each subtree are satisfied, we can minimize the delay of a buffered routing tree under the constraints that a signal can be correctly transmitted within one round trip and that the first undershoot is controlled so that it does not change the signal level.
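The division of a buffered routing tree into unbuffered subtrees can be sketched in Python as follows. This is a minimal illustration under stated assumptions: each buffer is treated as a load of the subtree that drives it and as the driver of a new subtree, and the tree representation, the function name, and the example topology are hypothetical (the topology is not that of Fig. 13).

```python
def split_at_buffers(children, buffers, root=0):
    """Split a buffered routing tree into unbuffered subtrees.

    children: dict mapping a node to the list of its child nodes.
    buffers:  set of nodes at which a buffer is placed.
    Returns a list of (subtree_root, edges), where edges is a list of
    (parent, child) pairs belonging to that subtree.
    """
    subtrees = []
    pending = [root]                     # drivers whose subtrees are not built yet
    while pending:
        sub_root = pending.pop()
        edges = []
        stack = [sub_root]
        while stack:
            node = stack.pop()
            for kid in children.get(node, []):
                edges.append((node, kid))
                if kid in buffers:
                    pending.append(kid)  # buffer: load here, driver of a new subtree
                else:
                    stack.append(kid)
        subtrees.append((sub_root, edges))
    return subtrees


if __name__ == "__main__":
    # Hypothetical tree with buffers at nodes 2 and 3, giving three subtrees.
    children = {0: [1], 1: [2, 3], 2: [4], 3: [5, 6]}
    for sub_root, edges in split_at_buffers(children, {2, 3}):
        print(sub_root, edges)
```

Each resulting subtree can then be handled as a routing tree with no buffers, for example by deriving its reflection constraints independently.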

VI. EXPERIMENTAL RESULTS

We used the nonlinear programming solver in the LINGO 6.0 system on an Intel Pentium II 400 MHz PC to compute the optimal wire and buffer sizes in a circuit path. All computations took less than 1 s. The parameters used are listed in Table I.

Given four lines of lengths 2.5, 5, 10, and 15 mm, we inserted a specified number of buffers at equal spacing. Then, we applied wire and/or buffer sizing to minimize delay. In Tables VII–X, Column D1 gives the delays and areas obtained by sizing wires and buffers simultaneously (denoted by SWBS); Column D2 (D3) gives the delays and areas obtained by sizing wires alone (denoted by WS), with the resistance of each gate equal to 90 Ω (60 Ω); and Column D4 (D5) lists the delays and areas obtained by sizing buffers alone (denoted by BS) with the fixed wire width of 0.3 μm (0.13 μm). Here, the area is the sum of the wire area (the product of width and length) and the buffer area (the product of the buffer size and the area of a minimum-size buffer). In Fig. 14(a)–(d), the path delays are plotted as functions of the number of buffers for the five optimization techniques D1, D2, D3, D4, and D5.
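The area measure defined above can be written as a short Python sketch. The function names are hypothetical, the minimum-size buffer area and buffer sizes in the example are placeholders rather than values from Table I, and the assumption that k equally spaced buffers split a line into k + 1 equal segments is ours.

```python
def equidistant_segments(line_length, num_buffers):
    """Segment lengths when buffers are inserted at equal spacing.
    Assumes k buffers split the line into k + 1 equal segments."""
    count = num_buffers + 1
    return [line_length / count] * count


def total_area(wire_widths, wire_lengths, buffer_sizes, min_buffer_area):
    """Wire area (width times length) plus buffer area (size times the
    area of a minimum-size buffer), all in consistent units."""
    wire_area = sum(w * l for w, l in zip(wire_widths, wire_lengths))
    buffer_area = sum(s * min_buffer_area for s in buffer_sizes)
    return wire_area + buffer_area


if __name__ == "__main__":
    # A 10 mm line (10000 um) with three buffers and a fixed 0.3 um width.
    lengths = equidistant_segments(10000.0, 3)
    widths = [0.3] * len(lengths)
    sizes = [2.0, 2.0, 2.0]              # placeholder buffer sizes
    print(total_area(widths, lengths, sizes, min_buffer_area=1.0))
```

Keeping all lengths in micrometers avoids mixing the millimeter path lengths with the micrometer wire widths.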

As shown in Fig. 14, the ranking of these techniques for optimizing circuit performance, from the most effective to the least, is SWBS > WS > BS. These results show the effectiveness of simultaneous wire and buffer sizing under the transmission line model. Further, the number of buffers required for performance optimization is quite small for simultaneous wire and buffer sizing. Since the delay is inversely proportional to the voltage at the receiving end, and voltage attenuation increases as wire length increases, inserting buffers partitions a wire into shorter sections, which decreases the voltage attenuation and thus the path delay.

VII. CONCLUSIONS

In this paper, we have presented an analytical model for computing the delay of a wire under the transmission line model. Extensive simulations have shown the high fidelity of our model. Compared with previous works [8], [11], our model leads to smaller average errors in delay estimation. Based on our model, we have shown the property that the minimum delay for a transmission line with reflection occurs when the number of round trips is minimized (i.e., equals one). Besides, we have shown that the delay of a circuit path is a posynomial function in wire and buffer sizes under the transmission line model, implying that a local optimum is equal to the global optimum. Thus, we can determine the optimal wire and buffer sizes for performance optimization by applying an efficient algorithm, such as the gradient search procedure. Experimental results have shown the effectiveness of simultaneous wire and buffer sizing in performance optimization under the transmission line model.


TABLE VII

EXPERIMENTAL RESULTS. D1: SIMULTANEOUS WIRE AND BUFFER SIZING. D2 AND D3: WIRE SIZING ALONE, AND THE GATE RESISTANCES ARE 90 AND 60 Ω, RESPECTIVELY. D4 AND D5: BUFFER SIZING ALONE, AND THE WIRE WIDTHS ARE 0.3 AND 0.13 μm, RESPECTIVELY. PATH LENGTH = 2.5 mm

TABLE VIII

EXPERIMENTAL RESULTS. D1: SIMULTANEOUS WIRE AND BUFFER SIZING. D2 AND D3: WIRE SIZING ALONE, AND THE GATE RESISTANCES ARE 90 AND 60 Ω, RESPECTIVELY. D4 AND D5: BUFFER SIZING ALONE, AND THE WIRE WIDTHS ARE 0.3 AND 0.13 μm, RESPECTIVELY. PATH LENGTH = 5 mm

TABLE IX

EXPERIMENTAL RESULTS. D1: SIMULTANEOUS WIRE AND BUFFER SIZING. D2 AND D3: WIRE SIZING ALONE, AND THE GATE RESISTANCES ARE 90 AND 60 Ω, RESPECTIVELY. D4 AND D5: BUFFER SIZING ALONE, AND THE WIRE WIDTHS ARE 0.3 AND 0.13 μm, RESPECTIVELY. PATH LENGTH = 10 mm

TABLE X

EXPERIMENTAL RESULTS. D1: SIMULTANEOUS WIRE AND BUFFER SIZING. D2 AND D3: WIRE SIZING ALONE, AND THE GATE RESISTANCES ARE 90 AND 60 Ω, RESPECTIVELY. D4 AND D5: BUFFER SIZING ALONE, AND THE WIRE WIDTHS ARE 0.3 AND 0.13 μm, RESPECTIVELY. PATH LENGTH = 15 mm


Fig. 14. Comparison of different optimization techniques. D1: simultaneous wire and buffer sizing. D2 and D3: wire sizing alone, with gate resistances of 90 and 60 Ω, respectively. D4 and D5: buffer sizing alone, with wire widths of 0.3 and 0.13 μm, respectively. (a) Path length = 2.5 mm. (b) Path length = 5 mm. (c) Path length = 10 mm. (d) Path length = 15 mm.

REFERENCES

[1] H. B. Bakoglu, Circuits, Interconnections, and Packaging for VLSI. Reading, MA: Addison-Wesley, 1990.

[2] M. S. Bazaraa, H. D. Sherali, and C. M. Shetty, Nonlinear Programming: Theory and Algorithms. New York: Wiley, 1993.

[3] C. P. Chen, Y. P. Chen, and D. F. Wong, “Optimal wire-sizing formula under the Elmore delay model,” in Proc. Design Automation Conf. (DAC), 1996, pp. 487–490.

[4] C. P. Chen, C. C. N. Chu, and D. F. Wong, “Fast and exact simultaneous gate and wire sizing by Lagrangian relaxation,” in Proc. Int. Conf. Computer-Aided Design (ICCAD), 1998, pp. 617–624.

[5] C. C. N. Chu and D. F. Wong, “A polynomial time optimal algorithm for simultaneous buffer and wire sizing,” in Proc. Design Automation and Test Europe (DATE), 1998, pp. 479–485.

[6] A. Deutsch et al., “When are transmission-line effects important for on-chip interconnections?,” IEEE Trans. Microwave Theory Tech., vol. 45, pp. 1836–1846, Oct. 1997.

[7] R. J. Duffin, E. L. Peterson, and C. Zener, Geometric Programming: Theory and Application. New York: Wiley, 1967.

[8] W. C. Elmore, “The transient response of damped linear networks with particular regard to wide band amplifiers,” J. Applied Physics, vol. 19, no. 1, 1948.

[9] Y. Gao and D. F. Wong, “Shaping a VLSI wire to minimize delay using transmission line model,” in Proc. Int. Conf. Computer-Aided Design (ICCAD), 1998, pp. 611–616.

[10] Y. Gao and D. F. Wong, “Wire-sizing for delay minimization and ringing control using transmission line model,” in Proc. Design Automation and Test Europe (DATE), 2000, pp. 512–516.

[11] Y. I. Ismail and E. G. Friedman, “Effects of inductance on the propagation delay and repeater insertion in VLSI circuits,” IEEE Trans. VLSI Syst., vol. 8, pp. 195–206, Apr. 2000.

[12] H. R. Jiang, J. Y. Jou, and Y. W. Chang, “Noise-constrained performance optimization by simultaneous gate and wire sizing based on Lagrangian relaxation,” in Proc. Design Automation Conf. (DAC), 1999, pp. 90–95.

[13] J. Lee and E. Shragowitz, “Overshoot and undershoot control for transmission line interconnects,” in Proc. Electronic Components and Technology Conf., 1999, pp. 879–884.

[14] T. Lin and L. T. Pileggi, “RC(L) interconnect sizing with second order considerations via posynomial programming,” in Proc. Int. Symp. Physical Design (ISPD), 2001, pp. 16–21.

[15] F. Moll, M. Roca, and A. Rubio, “Inductance in VLSI interconnection modeling,” Inst. Elect. Eng. Circuits, Devices Systems, vol. 145, no. 3, pp. 175–179, June 1998.

[16] L. T. Pillage and R. A. Rohrer, “Asymptotic waveform evaluation for timing analysis,” IEEE Trans. Computer-Aided Design, vol. 9, pp. 352–366, Apr. 1990.

[17] R. Gupta and L. Pileggi, “Modeling lossy transmission lines using the method of characteristics,” IEEE Trans. Circuits Syst. I, vol. 43, pp. 580–582, July 1996.

[18] J. M. Rabaey, Digital Integrated Circuits: A Design Perspective. Englewood Cliffs, NJ: Prentice-Hall, 1996.

[19] J. Rubinstein, P. Penfield Jr., and M. A. Horowitz, “Signal delay in RC tree networks,” IEEE Trans. Computer-Aided Design, vol. CAD-2, pp. 202–211, July 1983.

[20] J. S. Roychowdhury, A. R. Newton, and D. O. Pederson, “Algorithms for the transient simulation of lossy interconnect,” IEEE Trans. Computer-Aided Design, vol. 13, pp. 96–104, Jan. 1994.

[21] K. L. Shepard, D. Sitaram, and Y. Zheng, “Full-chip, three-dimensional, shapes-based RLC extraction,” in Proc. Int. Conf. Computer-Aided Design (ICCAD), 2000, pp. 142–149.

[22] International Technology Roadmap for Semiconductors 1999 Edition, Semiconductor Industry Association, 1999.

[23] Y. Sugiuchi, B. Katz, and R. A. Rohrer, “Interconnect optimization using asymptotic waveform evaluation (AWE),” in Proc. Multi-Chip Module Conf., 1994, pp. 120–125.

[24] W. Wolf, Modern VLSI Design: Systems on Silicon, 2nd ed. Englewood Cliffs, NJ: Prentice-Hall, 1996.

[25] J. S. H. Wang and W. W. M. Dai, “Optimal design of self-damped lossy transmission lines for multichip modules,” in Proc. Int. Conf. Computer-Aided Design (ICCAD), 1994, pp. 594–598.

[26] T. Xu, E. S. Kuh, and Q. Yu, “A sensitivity-based wiresizing approach to interconnect optimization of lossy transmission line topologies,” in Proc. Multi-Chip Module Conf., 1996, pp. 117–122.

[27] Q. Yu and E. S. Kuh, “Exact moment matching model of transmission lines and application to interconnect delay estimation,” IEEE Trans. VLSI Syst., vol. 3, pp. 311–322, June 1995.

[28] Q. Yu, E. S. Kuh, and T. Xu, “Moment models of general transmission lines with application to interconnect analysis and optimization,” IEEE Trans. VLSI Syst., vol. 4, pp. 477–494, Dec. 1996.

[29] Q. Zhu and W. M. Dai, “High-speed clock network sizing optimization based on distributed RC and lossy RLC interconnect models,” IEEE


Tai-Chen Chen received the B.S. and M.S. degrees in computer and information science from the National Chiao Tung University, Hsinchu, Taiwan, in 1999 and 2001, respectively. He is currently working toward the Ph.D. degree in the Graduate Institute of Electronics Engineering, National Taiwan University, Taipei. His current research interests include computer-aided design and interconnect optimization for deep submicron technology.

Mr. Chen received the Best Master’s Thesis Award from the National Science Council of the Republic of China in 2002.

Song-Ra Pan received the B.S. and M.S. degrees in computer and information science from National Chiao Tung University, Hsinchu, Taiwan, in 1998 and 2000, respectively. He is currently working toward the Ph.D. degree at the University of California, Santa Barbara.

From 2002 to 2003, he was with the Taiwan Semiconductor Manufacturing Company, Ltd., Hsinchu Science-Based Industrial Park, Taiwan, R.O.C. His current research interests include computer-aided design, testing, and verification.

Yao-Wen Chang (S’94–M’96) received the B.S. degree from National Taiwan University, Taipei, in 1988, and the M.S. and the Ph.D. degrees, all in computer science, from the University of Texas at Austin in 1993 and 1996, respectively.

Currently, he is an Associate Professor in the Department of Electrical Engineering and the Graduate Institute of Electronics Engineering, National Taiwan University. In summer 1994, he was with the VLSI design group of IBM T. J. Watson Research Center, Yorktown Heights, New York. From 1996 to 2001, he was on the faculty of the Department of Computer and Information Science, National Chiao Tung University, Taiwan, where he received an inaugural all-university Excellent Teaching Award in 2000. His research interests include physical design automation, architectures, and systems for VLSI and combinatorial optimization.

Dr. Chang received the Best Paper Award at the 1995 IEEE International Conference on Computer Design (ICCD-95) for his work on FPGA routing, reviewers’ Best Paper nominations at the 2000 ACM/IEEE Design Automation Conference (DAC) for his work on the B*-tree floorplan representation, and a Best Paper nomination at the 2002 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) for his work on multilevel routing. He has served on the Technical Program Committees of several international conferences on VLSI Design Automation. He is a Member of the IEEE Circuits and Systems Society, the Association for Computing Machinery (ACM), and ACM/SIGDA.