Multilevel Full-Chip Routing With Testability and Yield Enhancement



Katherine Shu-Min Li, Member, IEEE, Yao-Wen Chang, Member, IEEE, Chung-Len Lee, Member, IEEE,

Chauchin Su, Member, IEEE, and Jwu E. Chen, Member, IEEE

Abstract—We propose a multilevel full-chip routing algorithm

that improves testability and diagnosability, manufacturability, and signal integrity for yield enhancement. Two major issues are addressed. 1) The oscillation ring test (ORT) and its diagnosis scheme for interconnects based on the popular IEEE Standard 1500 are integrated into the multilevel routing framework to achieve testability enhancement. We augment the traditional multilevel framework of coarsening and uncoarsening by introducing a preprocessing stage that analyzes the interconnect structure for better resource estimation before the coarsening stage, and a final stage after uncoarsening that improves testability to achieve 100% interconnect fault coverage and maximal diagnosability. 2) We present a heuristic to reduce routing congestion to optimize the multiple-fault probability, chemical–mechanical polishing- and optical proximity correction-induced manufacturability, and crosstalk effects, for yield improvement. Experimental results on the Microelectronics Center for North Carolina benchmark circuits show that the proposed ORT method achieves 100% fault coverage and the optimal diagnosis resolution for interconnects. Further, the multilevel routing algorithm effectively balances the routing density to achieve 100% routing completion.

Index Terms—Interconnect, routing, signal integrity, yield.

I. INTRODUCTION

With ever-decreasing feature sizes and increasing chip dimensions, the integration complexity in system-on-a-chip (SOC) designs grows dramatically [1]. The high integration complexity is not only caused by the huge number of transistors and interconnects fabricated in a single chip but also by the modern SOC design issues in testability, manufacturability, and signal integrity. In particular, interconnect delay

Manuscript received December 16, 2004; revised June 20, 2005 and September 16, 2006. The work of K. S.-M. Li was supported in part by the National Science Council of Taiwan under Grant NSC 95-2218-E-110-009. The work of Y.-W. Chang was supported in part by the National Science Council of Taiwan under Grants NSC 93-2215-E-002-009, NSC 93-2215-E-002-029, and NSC 93-2752-E-002-008-PAE. The work of C. Su was supported in part by the National Science Council of Taiwan under Grants NSC 95-2221-E-009-328 MY3 and NSC 95-2221-E-009-334. This paper was recommended by Associate Editor T. Yoshimura.

K. S.-M. Li is with the Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung 804, Taiwan, R.O.C. (e-mail: [email protected]).

Y.-W. Chang is with the Department of Electrical Engineering and Graduate Institute of Electronics Engineering, National Taiwan University, Taipei 106, Taiwan, R.O.C. (e-mail: [email protected]).

C.-L. Lee is with the National Chiao Tung University, Hsinchu 300, Taiwan, R.O.C.

C. Su is with the Department of Electrical and Control Engineering, National Chiao Tung University, Hsinchu 300, Taiwan, R.O.C.

J. E. Chen is with the Department of Electrical Engineering, National Central University, Jhungli 32001, Taiwan, R.O.C.

Digital Object Identifier 10.1109/TCAD.2007.895587

dominates the circuit performance for nanometer IC designs. Therefore, it is desirable to handle the large-scale interconnect integration considering testability, manufacturability (including process variation control, chemical–mechanical polishing (CMP), optical proximity correction (OPC), etc.), and signal integrity issues simultaneously.

Testability and diagnosability are very important issues for interconnect design in SOC ICs. Many research works on interconnect testing can be found in the literature. Earlier works on interconnect testing targeted board-level testing. However, it is very difficult to apply these interconnect testing methods in the SOC environment without design-for-testability support. The popular IEEE Standard 1500 [2] provides structural support for core testing as well as interconnect testing in SOC. The IEEE Standard 1500 SOC test environment consists of a centralized test access mechanism (TAM) and wrappers around cores. The TAM defines the test control, whereas the wrappers provide a standardized interface for test data transmission. An interconnect oscillation ring test (IORT) [3] method was proposed to detect not only stuck-at and open faults but also delay and crosstalk glitch faults. Many testing and diagnosis problems are incurred by particular interconnect structures, which can be partly solved by a carefully designed router. Furthermore, to reduce the probability of multiple faults, it is desirable to reduce wiring congestion in a specific area. This approach is especially important as the probability of back-end-of-line defects (i.e., high-resistance via and interconnect defects) increases [4]. Therefore, the wire congestion is directly related to multiple fault probability and should be addressed in the routing stage.

The manufacturing process increasingly constrains physical layout design [5]. With the move to nanometer design, the semiconductor industry has seen an increase in defects and a decrease in yield. Defects arise from sources such as lithography errors, environmental variations, and process variations, and contribute to yield loss. Since defect mechanisms are ever more dependent on design, yield must now be treated throughout the design flow. To address yield loss effectively, a systemic solution in the design flow [6] is required.

One source of process variation comes from the CMP process [7], [8]. CMP-induced variation is kept within acceptable limits by controlling local feature (interconnect) density, relative to a process-specific “window size,” to achieve global planarization for manufacturability and performance. As a result, balancing interconnect density minimizes the CMP-induced variation, and thus routing plays an important role in determining CMP variation.


Fig. 1. (a) Testability and yield enhancement in the routing stage. (b) Reducing and balancing routing congestion improves multiple fault probability, CMP-induced variation, OPC effects, and crosstalk effects, all of which lead to yield enhancement.

OPC is one of the most effective methods adopted to compensate for the light diffraction effect, typically used as a postlayout process to improve manufacturability. Recently, Huang and Wong [9] proposed an algorithm that considers the OPC effect during routing by utilizing a symmetrical property. However, the process is time-consuming, and its results are still limited by the original layout quality. Again, balancing interconnect density can improve the OPC effects efficiently, since it may not always be possible to include OPC patterns in an overly congested area.

Signal integrity is also an important factor that affects yield in nanometer IC technology [1], and the crosstalk fault is a major source that contributes to the loss of signal integrity in nanometer IC [10]. The crosstalk effect is caused by coupling capacitance between adjacent wires, in which a signal change on an aggressor net interferes with the signal on a victim net. There are two types of crosstalk effects: 1) glitch and 2) delay. The glitch may induce malfunctioning in the logic values of circuit nodes, whereas the crosstalk-induced delay slows down both signals. In a heavily congested routing region, smaller wire spacing leads to larger coupling capacitance and thus greater crosstalk. Thus, reducing routing congestion also reduces crosstalk and the probability of crosstalk faults.

In this paper, we simultaneously consider the interconnect design issues of testability and diagnosability, manufacturability, and signal integrity in the routing stage for yield improvement [see Fig. 1(a)]. Traditionally, these issues are tackled at the postlayout stage. With the increasing design complexity, it is very difficult and even infeasible to handle those issues at the postlayout stage, when most interconnect layouts are fixed and cannot easily be changed. In particular, the design-for-manufacturability issues can all be improved through reducing and balancing the routing congestion [see Fig. 1(b)]. Therefore, it is desirable to develop a congestion-driven routing algorithm for yield improvement.

Since our goal is to enhance testability and yield in the multilevel full-chip routing, we shall first review some important routing work. Traditionally, the complex routing problem is often solved by using the two-stage approach of global routing, followed by detailed routing. Global routing first partitions the routing area into tiles and decides tile-to-tile paths for all nets, whereas detailed routing assigns actual tracks and vias for nets. Many routing algorithms adopt a flat framework of finding paths for all nets. Those algorithms can be classified into sequential and concurrent approaches. Sequential routing algorithms include maze-searching approaches [11], [12] and line-searching approaches [13], which route net by net. Most concurrent algorithms apply network-flow [13] or linear-assignment formulation [14], [15] to route a set of nets at one time.

The major problem of the flat framework lies in its scalability for handling larger designs. As technology advances, technology nodes are getting smaller, and circuit sizes are getting larger. To cope with the increasing complexity, researchers proposed the use of hierarchical approaches in handling the problem by dividing a routing region into subregions and independently routing each subregion. Marek-Sadowska [15] proposed a hierarchical global router based on linear assignment. Chang et al. [14] applied linear assignment to develop a hierarchical concurrent global detailed router for FPGAs.

The two-level hierarchical routing framework, however, lacks information for the interactions among the subregions and is thus still insufficient in handling the dramatically growing complexity in current and future IC designs [16]. Therefore, it is desirable to employ more levels of routing for very large-scale IC designs. The multilevel framework has attracted much attention in the literature recently. It employs a two-stage technique, namely, coarsening, followed by uncoarsening. The coarsening stage iteratively groups a set of circuit components (e.g., circuit nodes, cells, modules, and routing tiles) based on a predefined cost metric until the number of components being considered is smaller than a given threshold. Then, the uncoarsening stage iteratively ungroups a set of previously clustered circuit components and refines the solution by using a combinatorial optimization technique (e.g., simulated annealing, local refinement). The multilevel framework has been successfully applied to the VLSI physical design. For example, the famous multilevel partitioners ML [17] and hMETIS [18], the multilevel placer mPL [19], and the multilevel floorplanner/placer MB*-tree [20] all show the promise of the multilevel framework for large-scale circuit partitioning, placement, and floorplanning. A framework similar to multilevel routing was presented in [21] and [22]. Lin et al. [22] and Hayashi and Tsukiyama [21] presented hybrid hierarchical global routers for multilayer VLSIs, in which both the bottom-up (coarsening) and top-down (uncoarsening) techniques were used for global routing.

Recently, Cong et al. proposed a pioneering multilevel global-routing approach for large-scale full-chip routability-driven routing [16]. Cong et al. later proposed an enhanced multilevel routing system, which is named MARS [23]. Lin and Chang proposed a novel multilevel framework for full-chip routing, which considers both routability and performance [24]. This framework integrates global routing, detailed routing, and resource estimation together at each level, leading to more accurate routing resource estimation during coarsening and thus facilitating the solution refinement during uncoarsening. Recently, Ho et al. proposed yet another multilevel framework by introducing an intermediate layer and track assignment stage between coarsening and uncoarsening to handle crosstalk minimization [26].

Fig. 2. Test architecture for delay and crosstalk detection and delay measurement.

In this paper, we propose a multilevel full-chip routing framework considering testability and diagnosability, multiple fault probability, manufacturability, and signal integrity simultaneously. Different from the previous works, our approach has the following distinguishing features.

1) Simultaneously consider testability and diagnosability, multiple fault probability, manufacturability, and signal integrity in the multilevel routing framework.

2) Propose a new testability-driven multilevel routing framework, consisting of a preprocessing stage for oscillation ring test (ORT) generation for interconnect (IORT), a coarsening stage, an intermediate stage for optimization, an uncoarsening stage, and a postprocessing stage to process diagnosis patterns for interconnects (i.e., interconnect oscillation ring diagnosis, IORD).

3) Provide testability and yield enhancement solutions in the routing stage to both diagnose interconnects and improve density flexibility.

4) Present heuristics to balance and reduce congestion in routing for yield improvement (by reducing multiple fault probability, CMP variation, OPC effects, and crosstalk).

Experimental results on the Microelectronics Center for North Carolina (MCNC) benchmark circuits show that the proposed ORT method achieves 100% fault detection coverage and the optimal diagnosis resolution for interconnects. Further, the multilevel routing algorithm effectively balances the routing density to achieve 100% routing completion. Experimental results show that our method significantly improves routing quality for testability and yield enhancement. Compared with [24], our router improves the maximal congestion by 1.24–6.11×, with a runtime speedup of 1.08–7.66×, and improves the average congestion by 1.00–4.52× and the congestion deviation by 1.37–5.55×. Compared with [26], our router also improves the maximal congestion by 1.54–1.84×, the average congestion by 1.17–1.34×, and the congestion deviation by 1.13–1.63×.

This paper is organized as follows. Section II gives preliminary background on ORT and diagnosis and on the CMP and crosstalk models. Section III presents the integrated multilevel routing framework. Experimental results are reported in Section IV, and concluding remarks follow in Section V.

II. PRELIMINARIES

In this section, we give preliminary backgrounds on ORT and diagnosis, and the CMP and crosstalk models.

A. ORT Architecture for Interconnect

We first discuss the ORT for interconnects. ORT is an efficient method to detect faults in SOC interconnects [3]. An oscillation ring (OR) is a closed loop of a circuit under test with an odd number of signal inversions. Once the ring is constructed during test mode, an oscillation signal appears on the ring. Fig. 2 illustrates a global counter-based test architecture for both delay and crosstalk glitch detection for SOC ICs. This test architecture implements the IEEE Standard 1500 core test standard, in which each input/output pin of a core is attached with a wrapper cell, and a centralized TAM is provided to coordinate all test processes. In addition to the normal input/output connections, all wrapper cells in a core can also be connected with a shift register, which is usually referred to as a scan path, to facilitate test access.

A modified wrapper cell design has been proposed to provide extra connections and inversion control so that the ORs can be constructed through the wires and the boundary scan paths in cores [3]. For example, the ORT architecture in Fig. 2 consists of one OR and a neighboring net, and two scan paths in cores C1 and C2 are part of the OR.

This test architecture can detect stuck-at, open, delay, and crosstalk glitch faults. If an OR fails to oscillate, it implies that there exists stuck-at or open fault(s) in the OR. The period of the oscillation signal can also be measured by using a delay counter in a core to test delay faults, and a similar approach can be used for crosstalk glitch detection.


A local counter is included in each core, and a central counter is in the TAM of the chip. The central counter in the TAM is enabled by signal OscTest and triggered by the system clock. A local counter is connected to one wrapper cell in each core; however, it can be accessed by every wrapper cell through the wrapper cell chain. When an OR passes a core, an internal scan path is formed to connect the oscillation signal to the local counter. For example, consider core C1, in which


Fig. 3. Simulation waveform with process variation effects on the ORT scheme.

local counter through a series of modified wrapper cells that are configured as SI → SO. When an oscillation test session starts (OscTest = 1), the TAM enables its own central counter as well as all local counters in cores. After the central counter in the TAM counts to a specific number n, the oscillation test session terminates and all local counters are disabled (OscTest = 0). All the local counter contents can then be scanned out to automatic test equipment (ATE) for inspection.

Assume that $m$ ORs are tested. Let the frequency of the system clock be $f$, and the delay counter contents of the rings be $n_1, n_2, \ldots, n_m$, respectively. The oscillation frequency $f_i$ of the $i$th ring can be approximated by

$$f_i = f \times \frac{n_i}{n}. \tag{1}$$

Since the frequency of each ring is predetermined during the design phase, a delay fault can thus be detected and measured as compared with the result of the counters.
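As an illustration of how (1) could be applied to the scanned-out counter contents, the following minimal Python sketch estimates each ring's frequency and flags rings that deviate from their design-time value. The function names and the relative `tolerance` threshold are hypothetical and are not part of the proposed test architecture; a ring whose counter stays at zero is treated as a non-oscillating ring.

```python
def estimate_ring_frequency(f_clock_hz, n_central, n_local):
    """Estimate one ring's oscillation frequency with Eq. (1): f_i = f * n_i / n."""
    if n_central <= 0:
        raise ValueError("central counter value n must be positive")
    return f_clock_hz * n_local / n_central


def flag_suspect_rings(f_clock_hz, n_central, local_counts, expected_hz, tolerance=0.05):
    """Return indices of rings that did not oscillate or whose estimated
    frequency deviates from the design-time value by more than `tolerance`.

    `tolerance` is an assumed relative threshold chosen for illustration only.
    """
    suspects = []
    for i, (n_i, f_expected) in enumerate(zip(local_counts, expected_hz)):
        f_i = estimate_ring_frequency(f_clock_hz, n_central, n_i)
        if n_i == 0 or abs(f_i - f_expected) / f_expected > tolerance:
            suspects.append(i)
    return suspects
```

For example, with a 100-MHz system clock and n = 1000, local counts of 2000, 1500, and 0 against an expected 200 MHz would flag the second ring (slow) and the third ring (no oscillation).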

B. Process Variation Effects on Oscillation Signals

To evaluate the effect of process variation on the proposed ORT scheme, we conducted experiments on a ring consisting of seven inverters (plus transmission gates) and 20-µm lines. The Monte Carlo simulation was conducted by changing the W/L ratio of all transistors and the R, C parameters of the nets. The mean is the nominal value, whereas the distribution is Gaussian with 3σ = 20% of the nominal value. Four hundred simulation runs were performed, and the simulation results are shown in Fig. 3, in which all oscillation signals start at time 0.

At the end of the first cycle, there is a small variation in the cycle period, and the variations are less than 1.2% of the nominal cycle period. The simulation results show that: 1) this scheme can oscillate with an odd number of inversions and 2) process variation with 3σ = 20% of the nominal value changes the oscillation frequency and period by less than 1.2%.

Fig. 4. (a) Hypernet and (b) diagnosis graph model.

C. Interconnect Model in ORT

In this section, we show the interconnect model in the ORT scheme for interconnect detection. We give a graph modeling for interconnect detection. A multiterminal net is usually modeled by a hypernet. The circuit structure of an SOC can be directly transformed into a hypergraph, in which each vertex denotes a pin, whereas each hypernet represents a signal net. However, this graph model is not good enough for the ORT problem, as two branches of a net should belong to two different rings, and they cannot be simultaneously tested [3]. Therefore, we consider each branch of a hypernet separately, instead of treating them as a whole. Each branch of a hypernet thus corresponds to a two-pin net, which connects the source vertex to one of its sink vertices. An n-terminal hypernet is thus broken into (n − 1) two-pin nets. The result is a normal graph G = (V, E), where E is the set of two-pin nets.
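The decomposition of hypernets into two-pin nets can be sketched as follows. This is a minimal illustration only; the input layout (a dictionary mapping a net name to its source pin and list of sink pins) and the function name are assumptions made for the example.

```python
def split_hypernets(hypernets):
    """Break each hypernet into source-to-sink two-pin nets.

    hypernets: dict mapping a net name to (source_pin, [sink_pins]).
    Returns (V, E), where V is the set of pins and E is the list of
    (source, sink) two-pin nets, so an n-terminal net yields n - 1 edges.
    """
    vertices = set()
    edges = []
    for source, sinks in hypernets.values():
        vertices.add(source)
        for sink in sinks:
            vertices.add(sink)
            edges.append((source, sink))
    return vertices, edges
```

A four-terminal net with one source and three sinks therefore contributes three edges to E.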

A complete test for all interconnections is thus reduced to the problem of finding a set of rings that cover all edges corresponding to the interconnection structure in the graph $G$. This is equivalent to finding a set of subcircuits (rings) $R = \{G_1, G_2, \ldots, G_n\}$, such that:

1) $\forall G_i$: $G_i \subseteq G$, $G_i = (V_i, E_i)$, and $G_i$ is a ring;

2) $\bigcup_{i=1}^{n} E_i = E$.
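A candidate ring set can be checked against these conditions with a short routine such as the sketch below. It verifies only that every ring uses edges of G, that no ring contains two branches of the same net (identified here, as an assumption, by a shared source pin), and that the union of the ring edge sets equals E. It does not verify that each edge set actually closes into a ring through the core scan paths, since that depends on wrapper configuration not modeled here.

```python
def is_valid_ring_cover(edges, rings):
    """Check a candidate ring set against the covering conditions.

    edges: iterable of (source, sink) two-pin nets (the set E).
    rings: list of rings, each given as an iterable of edges from E.
    """
    all_edges = set(edges)
    covered = set()
    for ring_edges in rings:
        ring_set = set(ring_edges)
        if not ring_set <= all_edges:
            return False  # ring uses an edge that is not in G
        sources = [src for src, _ in ring_set]
        if len(sources) != len(set(sources)):
            return False  # two branches of one net share this ring
        covered |= ring_set
    return covered == all_edges  # union of E_i must equal E
```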

If delay faults are considered, the signal delay on each net along the ring should also be considered. The period of the oscillation signal is thus the summation of the path delay on all wires and scan paths. A large delay on an interconnect wire can be detected by observing the frequency of an oscillation signal that passes the wire under test. The detection can be masked by the variation of delays on other wires in the same ring, and thus the control of process variation is crucial for correct detection.

D. Interconnect Diagnosis With ORT

Diagnosis is the process of locating the exact fault site. The ORT can also be used for interconnect diagnosis. For interconnect diagnosis, the two-pin net model is also not sufficient. Consider the four-terminal net shown in Fig. 4(a), which is divided into five edge segments e1 to e5. If edge e1 is faulty, all three rings will not oscillate correctly. A faulty e3 affects rings 2 and 3, whereas faults on edges e2, e4, and e5 affect rings 1, 2, and 3, respectively. For diagnosis purposes, all five segments are different.
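The example above can be read as a small fault dictionary: each edge maps to the set of rings that fail when that edge is faulty. The sketch below encodes the three rings implied by the text (ring 1 through e1 and e2, ring 2 through e1, e3, and e4, ring 3 through e1, e3, and e5) and looks up the edge whose syndrome matches an observed set of failing rings. It assumes a single faulty edge; the names `RING_EDGES`, `build_fault_dictionary`, and `diagnose` are hypothetical.

```python
# Edge membership of each ring, as implied by the example for Fig. 4(a).
RING_EDGES = {
    1: {"e1", "e2"},
    2: {"e1", "e3", "e4"},
    3: {"e1", "e3", "e5"},
}


def build_fault_dictionary(ring_edges):
    """Map each edge to the frozenset of rings that fail if the edge is faulty."""
    syndrome = {}
    for ring, edges in ring_edges.items():
        for edge in edges:
            syndrome.setdefault(edge, set()).add(ring)
    return {edge: frozenset(rings) for edge, rings in syndrome.items()}


def diagnose(ring_edges, failing_rings):
    """Return the edges whose single-fault syndrome equals the observed failures."""
    observed = frozenset(failing_rings)
    fault_dict = build_fault_dictionary(ring_edges)
    return sorted(edge for edge, rings in fault_dict.items() if rings == observed)
```

Because all five syndromes are distinct, every single-edge fault in this example is uniquely located; for instance, failures on rings 2 and 3 point to e3.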

From the aforementioned discussion, hypernets cannot be used for diagnosis. Therefore, the interconnect structure is transformed into a diagnosis graph model as follows. The scan path and wrapper cells in a core are lumped into a single terminal node, as we assume that they are fault free. The fan-out points of a hypernet form dummy intermediate nodes, and a wire segment connecting two nodes is an edge. For example, the diagnosis graph model for the hypernet in Fig. 4(a) is shown in Fig. 4(b), in which the white node is a terminal node and gray nodes are intermediate nodes. An edge is the smallest unit of a wire segment to be uniquely diagnosed. From the aforementioned discussion, any stem affects all the downstream nodes and edges.

Fig. 5. Noise due to crosstalk-induced current.

E. CMP Model

With the IC industry rapidly switching to copper interconnects, the lack of etching techniques to remove copper has led manufacturers to use CMP to remove excess copper and associated barrier metals. The challenge is to optimize CMP process uniformity, which means minimizing copper dishing and dielectric erosion and ensuring that structures with widely varying feature densities are polished evenly. In order to improve manufacturability, the variation induced by CMP should be kept within acceptable limits.

Several models for oxide planarization via CMP have been proposed in [7]. Among them, the model in [27] is neither computationally expensive nor difficult to calibrate. In this model, the interlevel dielectric (ILD) thickness $z$ at location $(x, y)$ is calculated as follows:

$$z = \begin{cases} z_0 - \dfrac{K_i t}{\rho(x, y)}, & t < (\rho_0 z_1)/K \\ z_0 - z_1 - K_i t + \rho_0(x, y)\, z_1, & t > (\rho_0 z_1)/K. \end{cases} \tag{2}$$

In this model, the most important factor that determines the value of z is the effective pattern density ρ(x, y). In other words, we improve the variation of dielectric thickness z by keeping the effective pattern density ρ(x, y) relatively constant across the routing surface. Balancing the wiring congestion can effectively achieve this goal.
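To make the density dependence concrete, the following sketch evaluates the piecewise model of (2) for two pattern densities. Several simplifications here are assumptions rather than statements from the model's definition: the two rate symbols in (2) are treated as one constant `k`, the two density symbols are treated as one value `rho`, and `z0`, `z1`, and `t` are read as the initial thickness, a step height, and the polish time. The numerical values are arbitrary.

```python
def ild_thickness(t, rho, z0, z1, k):
    """Evaluate the piecewise thickness model of Eq. (2).

    Assumes a single rate constant k and a single pattern density rho
    (0 < rho <= 1) for both branches; units are left to the caller.
    """
    if not 0.0 < rho <= 1.0:
        raise ValueError("pattern density must be in (0, 1]")
    t_switch = rho * z1 / k  # time at which the model changes branch
    if t < t_switch:
        return z0 - k * t / rho
    return z0 - z1 - k * t + rho * z1


# Two regions polished for the same time but with different densities.
sparse = ild_thickness(t=2.0, rho=0.3, z0=1.0, z1=0.5, k=0.1)
dense = ild_thickness(t=2.0, rho=0.6, z0=1.0, z1=0.5, k=0.1)
thickness_gap = dense - sparse
```

Under these assumed values the two regions end at different thicknesses after the same polish time, which is the variation that balanced wiring density is meant to reduce.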

F. Crosstalk Model

In this section, we show the crosstalk model. Fig. 5 shows the noise model. The noise $\chi$ on the victim net is induced by a rising transition on the aggressor net through the coupling capacitance $c_c$. The coupling capacitance, which is defined as follows, is proportional to the fringing capacitance $c_f$ and the coupling length $l_c$, and inversely proportional to the distance $d$ between the aggressor and the victim nets:

$$c_c = \frac{l_c\, c_f}{d}. \tag{3}$$

Fig. 6. (a) Partitioned layout. (b) Routing graph.

In (4), consider a wire $e = (u, v)$, where $u$ and $v$ are two nodes in a routing tree. Let the length of the wire segment $e$ be $l_e$, and $T(v)$ be the subtree rooted at $v$. $I_{T(v)}$ is the total downstream current seen at $v$ and is the current induced by aggressor nets on downstream wires of $v$. The current on a unit-length wire induced by aggressor nets is $i_0 = \lambda p c_0$ [28], where $\lambda$ is the fixed ratio of coupling to total wire capacitance, $p$ is the slope (i.e., power supply voltage over input rise time) of the aggressor net's signal, and $c_0$ is the unit-length wire capacitance. In deep submicron process, a major part of the wire capacitance is attributed to the coupling capacitance if the wire spacing is kept minimum (e.g., $\lambda = 0.7$ [28]). The resulting noise $\chi(u, v)$ induced from the coupling current is the voltage pulse coupled from aggressor nets in the victim net for a wire segment $e = (u, v)$. The induced noise can be expressed as

$$\chi(u, v) = r_b I_{T(v)} + r_0 l_e \left( \frac{i_0 l_e}{2} + I_{T(v)} \right). \tag{4}$$

The crosstalk effect can be effectively reduced by increasing wire spacing, which decreases the unit-length coupling capac-itance according to (3). To achieve this goal, a router should constrain or limit the coupled number and length of adjacent wires in any area. Therefore, it is desirable to minimize the routing congestion to improve the crosstalk effects.
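The relations in (3) and (4) can be evaluated directly, as in the sketch below. The function and parameter names are hypothetical. The text does not define $r_b$ and $r_0$; they are passed through as plain coefficients, and reading them as a driver resistance and a unit-length wire resistance is an assumption.

```python
def coupling_capacitance(l_c, c_f, d):
    """Eq. (3): c_c = l_c * c_f / d."""
    if d <= 0:
        raise ValueError("wire spacing d must be positive")
    return l_c * c_f / d


def unit_coupling_current(lam, p, c_0):
    """Current induced on a unit-length wire: i_0 = lambda * p * c_0."""
    return lam * p * c_0


def segment_noise(r_b, r_0, l_e, i_0, i_downstream):
    """Eq. (4): chi(u, v) = r_b * I_T(v) + r_0 * l_e * (i_0 * l_e / 2 + I_T(v)).

    r_b and r_0 are taken as given coefficients (assumed to be a driver
    resistance and a unit-length wire resistance, respectively).
    """
    return r_b * i_downstream + r_0 * l_e * (i_0 * l_e / 2.0 + i_downstream)
```

Doubling the spacing `d` halves the value returned by `coupling_capacitance`, which is the effect a congestion-reducing router exploits.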

G. OPC-Driven Routing Mechanism

OPC offers basic corrections and a useful amount of yield improvement. The goal of OPC is to produce smaller features in an IC to enhance the “printability” of a wafer pattern. OPC applies systematic changes to photomask geometries to compensate for nonlinear distortions caused by optical diffraction and resist process effects. Specifically, these distortions include linewidth variations, which depend on pattern density and affect a device's operation speed, and line-end shortening, which breaks connections to contacts. A mask incorporating OPC is thus a system that negates undesirable distortion effects during pattern transfer.

OPC works by making small changes to the IC layout that anticipate the distortions. To compensate for line-end shortening, the line is extended using a hammerhead shape that results in a line in the resist that is much closer to the original intended layout. To compensate for corner rounding, serif shapes are added to (or subtracted from) corners to produce corners in the silicon that are closer to the ideal layout.

Fig. 7. Integrated multilevel routing framework.

The inclusion of OPC patterns in the interconnect structure may not always be possible when the wires are heavily congested, as the minimum wire spacing could be violated due to the added pattern. A balanced and less congested routing area thus increases the probability of successful OPC.

III. INTEGRATED MULTILEVEL ROUTING FRAMEWORK

We propose a new integrated multilevel routing framework in this section. The router considers routability, performance, testability, diagnosability, congestion, process variation, and crosstalk simultaneously.

The ORs for test are based on circuit connectivity, and thus they can be constructed before routing. However, when delay faults are considered, the routing structure must also be considered, since the wire delay is mainly decided by the wire length. On the other hand, the diagnosis process has to consider the actual net layout, and so it must be performed after the routing process.

A. Routing Model

Our global routing algorithm is based on a graph search technique guided by the congestion information associated with routing regions. The router assigns higher costs to route nets through congested areas (or those of higher delay and/or crosstalk costs) to balance the net distribution among routing regions. Before we apply the graph search technique to multilevel routing, we first model the routing architecture as a graph such that the graph topology represents the chip structure. Fig. 6 illustrates the routing graph model.

For the modeling, we first partition a chip into an array of rectangular subregions. These subregions are called global cells (GCs). A node in the graph represents a GC in the chip, and an edge denotes the boundary between two adjacent GCs. Each edge is assigned a weight/capacity according to the physical area or the number of tracks of a GC. The graph is used to represent the routing area and is called a multilevel routing graph, which is denoted by Gk, where k is the level ID. A global router finds GC-to-GC paths for all nets on a routing graph to guide the detailed routing. The goal of global routing is to route as many nets as possible while meeting the capacity constraint of each edge and any other constraints, if specified.
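The routing graph and a congestion-guided graph search can be sketched as follows. This is a simplified illustration, not the router described here: the cost function (one plus the usage-to-capacity ratio, infinite on a full edge) is an assumed stand-in for the actual congestion, delay, and crosstalk costs, and the function names are hypothetical.

```python
import heapq


def build_routing_graph(rows, cols, capacity):
    """Create one edge per boundary between adjacent GCs of a rows x cols array."""
    edges = {}
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                edges[frozenset({(r, c), (r, c + 1)})] = {"capacity": capacity, "usage": 0}
            if r + 1 < rows:
                edges[frozenset({(r, c), (r + 1, c)})] = {"capacity": capacity, "usage": 0}
    return edges


def edge_cost(edge):
    """Assumed congestion cost: grows with usage, infinite when the edge is full."""
    if edge["usage"] >= edge["capacity"]:
        return float("inf")
    return 1.0 + edge["usage"] / edge["capacity"]


def route(edges, src, dst):
    """Find a GC-to-GC path with Dijkstra's algorithm and commit its usage.

    Returns the list of GCs on the path, or None if every path would
    exceed some edge capacity.
    """
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        r, c = u
        for v in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            key = frozenset({u, v})
            if key not in edges:
                continue  # neighbor lies outside the GC array
            nd = d + edge_cost(edges[key])
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    path.reverse()
    for a, b in zip(path, path[1:]):
        edges[frozenset({a, b})]["usage"] += 1
    return path
```

On a 2 × 2 array with unit capacity, a second net between the same two GCs is pushed onto the detour around the filled boundary, and a third one fails, which mirrors how rising edge costs spread nets across routing regions.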

As the process technology advances, multiple routing layers are possible. The number of layers in a modern chip can be more than eight. Wires in each layer can run either horizontally (H) or vertically (V) in a grid style.

As illustrated in Fig. 7, G0 corresponds to the routing graph of level 0 of the multilevel coarsening stage. At each level, our global router first finds routing paths for the local nets (or local two-pin connections), i.e., those nets that sit entirely inside a GC. After the global routing is performed, we merge 2 × 2 GCs into a larger Gi and at the same time perform resource estimation for use at the next level (i.e., level 1 here). Coarsening continues until the number of GCs at a level, for example, the kth level, is below a threshold. The uncoarsening stage tries to refine the routing solution of the unassigned segments of level k. During uncoarsening, the unroutable nets are handled by point-to-path maze routing and rip-up and reroute to refine the routing solution. Then, we proceed to the next level (level k − 1) of uncoarsening by expanding each Gk to four finer Gk−1's. The process continues until we reach level 0, when the final routing solution is obtained.
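One way to picture the coarsening schedule is to compute, for each two-pin connection, the first level at which both pins fall inside the same merged cell and the connection therefore becomes local. The sketch below assumes that level-0 GCs are indexed by nonnegative (row, column) pairs and that every level merges aligned 2 × 2 blocks, so a cell index at level k is the level-0 index shifted right by k bits. These indexing details and the function names are assumptions for illustration.

```python
def local_level(pin_a, pin_b):
    """Smallest level k at which both pins lie in the same cell of G_k.

    pin_a, pin_b: (row, col) indices of the level-0 GCs holding the pins.
    """
    (ra, ca), (rb, cb) = pin_a, pin_b
    level = 0
    while (ra >> level, ca >> level) != (rb >> level, cb >> level):
        level += 1
    return level


def schedule_nets(nets):
    """Group two-pin connections by the coarsening level at which they become local.

    nets: dict mapping a net name to a pair of (row, col) pin locations.
    """
    by_level = {}
    for name, (a, b) in nets.items():
        by_level.setdefault(local_level(a, b), []).append(name)
    return dict(sorted(by_level.items()))
```

Connections that stay inside one level-0 GC are scheduled at level 0, while a connection crossing a high-level block boundary is deferred to a coarser level even if its pins are physically adjacent.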

B. Testability-Aware Multilevel Routing

In the coarsening stage of multilevel routing, shorter nets are routed first, and a congestion-driven heuristic is used to guide a pattern router. For all the nets that can be successfully routed, both global routing and detailed routing are conducted. All the nets that fail to complete are handled at the uncoarsening stage. At the uncoarsening stage, the failed nets are routed by a global router with a different cost function to avoid heavily congested areas, and a detailed maze router is used to determine the final routing paths.

In addition to the traditional multilevel framework, we incorporate an ORT in the preprocessing stage to guide the resource estimation for interconnects and to achieve 100% fault detection coverage, an intermediate stage for interconnect optimization, and an oscillation ring diagnosis in the postprocessing stage to guarantee maximal interconnect diagnosability (see Fig. 7).

C. Diagnosability-Aware Routing Structure

The minimum spanning tree (MST) topology leads to the minimum total wire length, and thus congestion is often easier to control for an MST than for other topologies. However, this topology may result in longer critical paths and thus degrade circuit performance. In contrast, a shortest path tree may result in the best performance, but its total wire length (and congestion) may be significantly larger than that constructed by the MST algorithm.

The diagnosis problem also affects the routing structure. For instance, consider the four-terminal net example shown in Fig. 8. With the spanning tree connection given in Fig. 8(a), there are three different net segments to be diagnosed. On the other hand, as in the diagnosis graph model shown in Fig. 4(b), the Steiner tree connection given in Fig. 8(b) has two intermediate nodes (indicated by the two dotted circles) and thus five net segments to be diagnosed. In general, a spanning tree connection yields fewer wire segments to be diagnosed, and thus it is favored in our router. Our algorithm first constructs the MST structure whenever possible, which is best for diagnosability. Otherwise, it finds a routing tree with the least number of intermediate nodes.

Fig. 8. Two routing trees. (a) Spanning tree with three segments. (b) Steiner tree with the minimum number of intermediate nodes, resulting in five segments.

Fig. 9. (a) Shortest path algorithm. (b) n(v) computation.

In order to route a net with the minimum number of intermediate branch nodes and the shortest path, we apply the algorithm shown in Fig. 9(a) for the routing tree construction. The algorithm, which is based on Dijkstra’s shortest path algorithm, finds the shortest path with the minimum number of intermediate nodes. It associates each basic detailed routing region u with two labels, d(u) and n(u), where d(u) is the distance of the shortest path from source s to u, and n(u) is the minimum number of intermediate nodes along the shortest path from s to u. Initially, d(u) = ∞ and n(u) = ∞ for all u ≠ s, d(s) = 0, and n(s) = 0. The computation of the labels d is the same as in the original Dijkstra’s algorithm. The computation of n(v) is shown in Fig. 9(b), where dist(u, v) and node(u, v) are the distance and the number of intermediate nodes between nodes u and v, respectively.
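The two-label search described above can be sketched as below. This is one possible realization, not the authors' pseudocode from Fig. 9: it assumes that ties in d are broken in favor of the smaller n (a lexicographic comparison of the pair (d, n)), and the function name `shortest_path_min_nodes` and the adjacency-list format are hypothetical.

```python
import heapq

def shortest_path_min_nodes(graph, s):
    """Dijkstra variant with labels d(u) (distance) and n(u) (intermediate nodes).

    `graph[u]` is a list of (v, dist_uv, node_uv) tuples, where node_uv is the
    number of intermediate nodes added by taking edge (u, v).
    Assumption: ties in d are broken by the smaller n (lexicographic order).
    """
    inf = float("inf")
    d = {u: inf for u in graph}
    n = {u: inf for u in graph}
    d[s], n[s] = 0, 0
    heap = [(0, 0, s)]
    while heap:
        du, nu, u = heapq.heappop(heap)
        if (du, nu) > (d[u], n[u]):
            continue  # stale queue entry
        for v, dist_uv, node_uv in graph[u]:
            cand = (du + dist_uv, nu + node_uv)
            if cand < (d[v], n[v]):
                d[v], n[v] = cand
                heapq.heappush(heap, (cand[0], cand[1], v))
    return d, n

# Two equally short paths from s to t; only the one through "a" adds a node.
graph = {
    "s": [("a", 1, 1), ("b", 1, 0)],
    "a": [("t", 1, 0)],
    "b": [("t", 1, 0)],
    "t": [],
}
d, n = shortest_path_min_nodes(graph, "s")
```

In this small example both paths to t have length 2, and the search keeps the one through b because it introduces no intermediate node.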

D. Cost Metric for Routing Density Control

A router that incurs imbalanced routing density may degrade system performance in many ways.

1) Crosstalk effects are the results of signal coupling between adjacent wires, and the coupling capacitance is usually inversely proportional to the distance between wires. In a heavily congested area, the distance between adjacent wires is small, and thus the probability of crosstalk faults is increased.

2) Physical defects in a congested area may create multiple faults, which are difficult to detect and diagnose.

3) Process variation due to CMP effects is usually caused by unbalanced routing congestion or density.

4) OPC pattern compensation is usually difficult, or even impossible, to perform in an overly dense or imbalanced area.

Therefore, it is desirable to simultaneously balance and reduce routing congestion/density in all areas for router design. The global routing is based on the approach used in the pattern router [25] and first routes local nets on the tiles of level 0. Let the multilevel routing graph of level $i$ be $G_i = (V_i, E_i)$. Let $R_e = \{e \in E_i \mid e \text{ is an edge chosen for routing}\}$. In order to balance the routing density, we use the following cost function $\alpha : E_i \to \mathbb{R}$ to guide the routing:

$$\alpha(R_e) = \sum_{e \in R_e} c_e. \tag{5}$$

The parameter $c_e$ is the congestion of edge $e$, and it is defined as

$$c_e = \begin{cases} \dfrac{1}{2^{[(p_e/t) - d_e]}}, & d_e < (p_e/t) \\ 1, & d_e \ge (p_e/t) \end{cases}$$

where $p_e$ and $d_e$ are the capacity of edge $e$ and the number of nets assigned to edge $e$, respectively. The parameter $t$ is used to define the target level of the maximum density and is determined either by the user or by averaging over all routing areas. For example, if the goal is to make the average routing density half of the maximum acceptable density, then $t$ is set to 2.
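A small sketch of the congestion cost may help make the definition concrete. It assumes the piecewise definition reads c_e = 1/2^[(p_e/t) − d_e] when d_e < p_e/t and c_e = 1 otherwise; the function names and the dictionary-based edge data are hypothetical.

```python
def edge_congestion(p_e, d_e, t):
    """c_e from the piecewise definition: 1 / 2**((p_e/t) - d_e) below the
    target density p_e/t, and 1 at or above it (assumed reading)."""
    target = p_e / t
    if d_e < target:
        return 1.0 / 2 ** (target - d_e)
    return 1.0

def alpha(route_edges, capacity, demand, t):
    """Cost (5): sum of c_e over the edges chosen for the route."""
    return sum(edge_congestion(capacity[e], demand[e], t) for e in route_edges)

# Three edges of capacity 8 with 0, 2, and 4 nets already assigned; t = 2
# puts the target density at 4 nets per edge.
capacity = {"e1": 8, "e2": 8, "e3": 8}
demand = {"e1": 0, "e2": 2, "e3": 4}
cost = alpha(["e1", "e2", "e3"], capacity, demand, t=2)
```

Under this reading, the cost of an edge doubles with each net added until the target density p_e/t is reached, after which it saturates at 1, so a route through lightly used edges is always cheaper than one through edges at or above the target.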

After the global routing is completed, we perform detailed routing with the guidance of the global-routing results and find a real path in the chip. Our detailed router is based on the maze-searching algorithm. Pattern routing uses an L-shaped or a Z-shaped route to make the connection, which gives the shortest path length between two points. Therefore, the wire length is minimized, and we do not include wire length in the cost function at this stage. We measure the routing congestion based on the commonly used channel density. After the detailed router finishes routing a net, the channel density associated with an edge of a multilevel graph is updated accordingly.
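The claim that L-shaped and Z-shaped patterns always give the shortest path length can be checked with a short enumeration. The sketch below assumes integer tile coordinates and routes confined to the bounding box of the two pins; the helper names are hypothetical and the enumeration is illustrative, not the pattern router of [25].

```python
def l_shaped_routes(src, dst):
    """Return the (at most two) L-shaped routes as lists of corner points."""
    (x1, y1), (x2, y2) = src, dst
    if x1 == x2 or y1 == y2:
        return [[src, dst]]  # straight connection, no bend needed
    return [[src, (x2, y1), dst], [src, (x1, y2), dst]]

def z_shaped_routes(src, dst):
    """Return the two-bend (Z-shaped) routes inside the bounding box."""
    (x1, y1), (x2, y2) = src, dst
    if x1 == x2 or y1 == y2:
        return []  # no room for a two-bend route
    step_x = 1 if x2 > x1 else -1
    step_y = 1 if y2 > y1 else -1
    routes = []
    for x in range(x1 + step_x, x2, step_x):   # horizontal-vertical-horizontal
        routes.append([src, (x, y1), (x, y2), dst])
    for y in range(y1 + step_y, y2, step_y):   # vertical-horizontal-vertical
        routes.append([src, (x1, y), (x2, y), dst])
    return routes

def route_length(route):
    """Total rectilinear length of a route given by its corner points."""
    return sum(abs(ax - bx) + abs(ay - by)
               for (ax, ay), (bx, by) in zip(route, route[1:]))

src, dst = (0, 0), (3, 2)
candidates = l_shaped_routes(src, dst) + z_shaped_routes(src, dst)
lengths = {route_length(r) for r in candidates}
```

Every candidate has the same rectilinear length as the Manhattan distance between the pins, which is why wire length can be left out of the cost function at this stage and the choice among patterns can be driven by congestion alone.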

If both the L-shaped and Z-shaped patterns of our global router fail, we give up routing the connection, and an overflow occurs. We refer to a net (connection) that causes an overflow as a failed net (failed connection). The failed nets (connections) will be reconsidered (refined) at the uncoarsening stage.

The uncoarsening stage starts to refine each local failed net (connection) left from the coarsening stage. The global router is now changed to the maze router with the following cost function $\beta : E_i \to \mathbb{R}$:

$$\beta(R_e) = \sum_{e \in R_e} (a \cdot c_e + b \cdot o_e) \tag{6}$$

where $a$ and $b$ are user-defined parameters, and $o_e \in \{0, 1\}$. If an overflow happens, $o_e$ is set to 1; otherwise, it is set to 0.


TABLE I

EXPERIMENTAL RESULTS BASED ON THE MCNC BENCHMARKS FOR TESTABILITY ENHANCEMENT OF INTERCONNECT DETECTION AND DIAGNOSIS

TABLE II

ROUTING BENCHMARK CIRCUITS

There is a tradeoff between minimizing congestion and overflow. At the uncoarsening stage, we intend to resolve the overflow in a tile. Therefore, we make b much larger than a. In addition, a detailed maze routing is performed after the global maze routing. Iterative refinement of a failed net is stopped when a route is found. Uncoarsening continues until the first level G0 is reached and the final solution is found.
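The effect of choosing b much larger than a in the uncoarsening cost function β (the sum over route edges of a·c_e + b·o_e) can be illustrated with a tiny example. The weights a = 1 and b = 100, the edge names, and the congestion values below are assumptions chosen for illustration only.

```python
def beta(route_edges, congestion, overflow, a=1.0, b=100.0):
    """Cost (6): sum over route edges of a*c_e + b*o_e, with o_e in {0, 1}.

    The default weights are illustrative; the paper only requires b >> a.
    """
    return sum(a * congestion[e] + b * overflow[e] for e in route_edges)

congestion = {"e1": 0.25, "e2": 1.0, "e3": 0.5}
overflow = {"e1": 0, "e2": 1, "e3": 0}

detour = beta(["e1", "e3"], congestion, overflow)  # two edges, no overflow
direct = beta(["e2"], congestion, overflow)        # one edge, but it overflows
```

With these weights the overflowing edge dominates the cost, so the maze router prefers the longer detour, which is the intended behavior when the goal is to resolve overflow first.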

IV. EXPERIMENTAL RESULTS

The multilevel routing system was implemented in the C programming language on a 900-MHz SUN Blade 2500 workstation with 1-GB memory. We conducted three sets of experiments: 1) testability enhancement; 2) congestion control for routing considering multiple faults, manufacturability, and crosstalk; and 3) process variation and crosstalk improvement due to congestion. Three types of benchmarks were used in our experiments: the first type is for intermodule interconnects only (see Table I); the second is the full-chip benchmarks (only mcc1 and mcc2), which include both intermodule interconnections and intramodule interconnections; and the third type contains only intramodule interconnections, which are local interconnections within standard-cell modules. The statistics of type-2 and -3 benchmarks are given in Table II.

A. Testability Enhancement

For testability enhancement, the experimental results of the embedded ORT scheme in the proposed multilevel routing framework are reported in Table I. We have presented both a detection scheme (the preprocessing stage) and a diagnosis scheme (the postprocessing stage), as shown in Fig. 7, for OR-based interconnect testing in SOC in a predetermined design flow. Thus, fmin ≤ fi ≤ fmax gives the timing specification for this scheme, where fi is the estimated oscillation frequency for the ith ring. Since the target of this ORT scheme is the interconnects among modules, our experiments are conducted on the MCNC benchmark circuits with intermodule connections.

Table I gives the names of the circuits, the statistics for the circuits (the number of cores, #core; the number of pads, #pad; the number of hyperedges, #hyp; the number of two-pin nets, #2-pin), the number of rings constructed for detection, |Rt|, and the number of rings constructed for diagnosis, |Rd|. Thus, |Rt| is the testability-driven cost in the preprocessing stage, and |Rd| − |Rt| is the additional cost for the post-processing stage. In addition to the 100% fault coverage of the OR detection scheme, we also obtain 100% net segment diagnosability.

To show the feasibility of this scheme, we include the estimated ATE measurement time in the parentheses in Table I. Since the frequency of each ring is predetermined during the design phase, a delay fault can be detected and measured by inspecting the contents of the local core counters (see Fig. 2). Let the oscillation frequency of the rings, according to the timing specification, be $f_{\min} \le f_i \le f_{\max}$, with the unit time of measuring $T_0$ ($= n/f$). Thus, we have the delay counter contents of $n_{\min} \le n_i \le n_{\max}$, where $n_{\min} = f_{\min} \times T_0$ and $n_{\max} = f_{\max} \times T_0$. Let $\xi$ be the resolution of delay measurement, and $\varepsilon$ be the maximum measurement error. Since a counter’s maximum measurement error is ±1, $\varepsilon$ is the reciprocal of $f_{\min} \times T_0$ and must satisfy

$$\varepsilon = \frac{1}{f_{\min} \times T_0} \le \xi. \tag{7}$$

We show an example of the delay measurement. Let the frequency specification of the ORs be 4 to 400 MHz and $\xi$ be 0.001, which implies that the counter content $n_{\min}$ is at least 1000. From (7), the required $T_0$ is at least 250 µs. Thus, we get the estimated detection and diagnosis times in the parentheses. For example, for the ac3 circuit, we need 133 rings for detection and 374 rings for diagnosis; therefore, 133 × 250 µs = 33.25 ms is needed for interconnect detection, and 374 × 250 µs = 93.5 ms for interconnect diagnosis. This shows the effectiveness and efficiency of the testability enhancement.
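The measurement-time example for ac3 can be reproduced with a few lines. The sketch assumes that the rings are measured one after another, each for the minimum unit time T0 allowed by the resolution requirement 1/(fmin × T0) ≤ ξ; the function names are hypothetical.

```python
import math

def min_measure_time(f_min_hz, xi):
    """Smallest T0 (in seconds) satisfying 1 / (f_min * T0) <= xi."""
    return 1.0 / (f_min_hz * xi)

def total_time_ms(num_rings, t0_s):
    """Total measurement time in ms, assuming rings are measured one at a time."""
    return num_rings * t0_s * 1e3

t0 = min_measure_time(4e6, 0.001)   # f_min = 4 MHz, resolution 0.001
n_min = 4e6 * t0                    # counter content at f_min
detect_ms = total_time_ms(133, t0)  # ac3: 133 detection rings
diagnose_ms = total_time_ms(374, t0)  # ac3: 374 diagnosis rings
```

These values agree (up to floating-point rounding) with the 250 µs, 33.25 ms, and 93.5 ms quoted in the text.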

B. Congestion Control for Multiobjective Optimization

As mentioned in Section II, the CMP variation is controlled by reducing the variation of pattern density ρ (see Section II-E), whereas the signal integrity problem can be alleviated by reducing the routing congestion (see Section II-F). It will be clear later that our router effectively reduces both parameters (i.e., the congestion variation for CMP and the maximum congestion for the crosstalk effect).

The statistics of type-2 and -3 benchmarks are given in Table II, including the circuit size, the number of layers, the number of nets, and the number of pins. Table III reports the results for multilevel routing considering multiple faults, manufacturability, and crosstalk. We compare three different routing algorithms: 1) performance-driven MR [24]; 2) routability-driven MR [24]; and 3) our proposed method (with MST routing and balanced density).

TABLE III

COMPARISON OF ROUTING RESULTS OF MAXIMUM DENSITY WITH BOTH MAXIMUM DELAY AND AVERAGE DELAY

TABLE IV

COMPARISON OF ROUTING RESULTS OF STATISTICAL DENSITY WITH [24]

Fig. 10. Routing density distribution for mcc1 for (a) performance-driven MR, (b) routability-driven MR, and (c) proposed algorithm.

In each case, we give the maximum (critical path) delay dmax, the average delay davg, and the maximum number of nets crossing a level-0 tile, #NetPEAK, which is a good estimate for the maximum routing density. In our experiment, we set the parameter t = 4 for the ISCAS89 circuits and t = 2 for the other benchmarks. The completion rate is 100% for all cases. The proposed method achieves about the same level of performance as the routability-driven method, with at most a 0.2% increase in dmax and davg, but the maximum density is much smaller. Compared with [24], the experimental results show that our router improves the maximal congestion (#NetPEAK) by 1.24–6.11×, with a runtime speedup of 1.08–7.66×.

In Table IV, we show some statistical density results. The average number of nets crossing a level-0 tile is denoted by #Netavg, and we also list those of vertical tiles and horizontal tiles, #Netavg_v and #Netavg_h, respectively. In addition, σv denotes the standard deviation over the vertical tiles and σh that over the horizontal tiles. The results show that our scheme is more effective for the full-chip benchmarks mcc1 and mcc2. For the other intramodule routing benchmarks, our scheme also improves the results in most cases. Compared with [24], the experimental results show that our router improves the average congestion by about 1.00–4.52× and improves the congestion balance (σv and σh, the standard deviations for vertical and horizontal tiles, respectively) by 1.37–5.55×.

TABLE V

COMPARISON OF ROUTING RESULTS OF STATISTICAL DENSITY WITH [26]

TABLE VI

NORMALIZED MAXIMUM ILD VARIATION

To demonstrate the effectiveness of the proposed algorithm in balancing the routing density, the number of horizontal wires crossing each level-0 tile for benchmark mcc1 is shown in Fig. 10 for the three algorithms. The performance-driven MR produces the least balanced routing, with a peak congestion (#NetPEAK) of 181 in mcc1. The routability-driven MR tries to avoid congested areas to improve the probability of successful routing and thus reduces the maximum density, but its peak congestion is still 61. With our proposed algorithm, the maximum density is further reduced to 45, and thus the manufacturability effects, the probability of multiple faults, and crosstalk effects are reduced accordingly.

For mcc1, our proposed algorithm improves the maximal congestion by 1.36× compared to the routability-driven MR and by 4.02× compared to the performance-driven MR. It improves the average congestion by 1.01–1.02× compared to the routability-driven MR and by 2.81–2.85× compared to the performance-driven MR. For balanced congestion on mcc1, our proposed algorithm improves the result by 1.38–1.48× compared to the routability-driven MR and by 2.72–3.32× compared to the performance-driven MR. In terms of runtime, our approach achieves a speedup of 1.06× over the routability-driven MR and 3.08× over the performance-driven MR.
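As a quick check, the two maximal-congestion ratios for mcc1 follow directly from the peak congestion values quoted earlier for Fig. 10 (181 for the performance-driven MR, 61 for the routability-driven MR, and 45 for the proposed algorithm):

$$\frac{61}{45} \approx 1.36, \qquad \frac{181}{45} \approx 4.02.$$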

Further, for the interconnection congestion, as evident in the intermodule connections in mcc1 and mcc2, the maximal and average congestion improve by 1.39–3.23× and 1.03–2.36×, respectively, and the congestion balance (σv and σh, the standard deviations for vertical and horizontal tiles, respectively) improves by 1.37–2.76×.

Table V compares the results of our work with the crosstalk- and performance-driven router presented in [26]. Our router improves the maximal and average congestion by 1.54–1.84× and 1.17–1.34×, respectively, and the congestion balance (σv and σh, the respective standard deviations for vertical and horizontal tiles) by 1.13–1.63×. Note that we do not compare the maximum (critical path) delay dmax and the average delay davg with [26], since our congestion-guided router achieves 100% routing completion, whereas the work in [26] does not.

C. Process Variation and Coupling Capacitance

To estimate the reduction in ILD thickness variation due to CMP effects, we use (2) to evaluate the ILD thickness by assuming enough polishing time (i.e., t > (ρ0 z1)/Ki for the maximum ρ0). The results are given in Table VI, assuming that vertical and horizontal wires are routed in different metal layers. The maximum ILD variation is calculated in each case, and the result is compared with our method. Since a major cause of process variation is the CMP variation [7], [8], it is essential to achieve CMP process uniformity. A simple and effective way to control the ILD variation due to the CMP process is through balanced density routing. Compared with the performance-driven routing [24], the ILD variations in vertical and horizontal metal layers are improved by 9.17× and 19.68×, respectively. The respective average improvements over the routability-driven router [24] in vertical and horizontal metal layers are 1.13× and 1.03×.

TABLE VII

COMPARISON IN MAXIMUM UNIT-LENGTH COUPLING CAPACITANCE

The crosstalk noise is induced by signal transition in the aggressor net and the coupling capacitance between adjacent wires. Thus, the best way to avoid crosstalk noise is to reduce coupling capacitance by increasing wire spacing, as suggested by (3). In Table VII, we show the worst-case unit-length capacitance in each case, as compared to our method. The results in Table VII show that the proposed balanced-density router achieves the best worst-case unit-length coupling capacitance among all routers. Compared with the routability-driven router [24], our congestion-control router improves the unit-length coupling capacitance by 1.03× to 1.53× with 100% routing completion. Compared with the performance-driven router [24], our congestion-control router improves the unit-length coupling capacitance by 2.85× to 11.83×, also with 100% routability. Compared with the multilevel router [26], our congestion-control router improves the unit-length coupling capacitance by 1.13× to 4.29×, also with 100% routability. All our experimental results show that reducing congestion is an effective method for reducing coupling capacitance.

V. CONCLUSION

We have shown that the embedded ORT and diagnosis scheme are feasible based on the simulation results with TSMC 0.18-µm process technology. In addition, this ORT scheme achieves 100% fault detection coverage and maximal diagnosability. We have also presented an effective multilevel routing framework that applies a congestion-driven routing algorithm to reduce the multiple-fault probability, CMP- and OPC-induced effects, and crosstalk effects for yield enhancement.

REFERENCES

[1] Semiconductor Industry Association (SIA), International Technology Roadmap for Semiconductors (ITRS), 2004.

[2] IEEE Standard Testability Method for Embedded Core-based Integrated Circuits, Test Technology Technical Council of the IEEE Computer Society, IEEE Std. 1500, 2005.

[3] K. S.-M. Li, C. Su, Y.-W. Chang, C.-L. Lee, and J.-E. Chen, “IEEE Standard 1500 compatible interconnect diagnosis for delay and crosstalk faults,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 25, no. 11, pp. 2513–2525, Nov. 2006.

[4] A. B. Kahng, B. Liu, and I. I. Mandoiu, “Non-tree routing for reliability and yield improvement,” in Proc. IEEE/ACM Int. Conf. Comput.-Aided Des., Nov. 2002, pp. 260–266.

[5] W. Maly, “Moore’s law and physical design of ICs,” in Proc. ACM Int. Symp. Phys. Des., 1998, p. 36. (special address).

[6] V. K. R. Chiluvuri, “Yield optimization in physical design: A review,” in Proc. 5th ACM/SIGDA Phys. Des. Workshop, 1996, pp. 198–206.

[7] G. Nanz and L. E. Camilletti, “Modeling of chemical–mechanical polishing: A review,” IEEE Trans. Semicond. Manuf., vol. 8, no. 4, pp. 382–389, Nov. 1995.

[8] Y. Chen, A. B. Kahng, G. Robins, and A. Zelikovsky, “Practical iterated fill synthesis for CMP uniformity,” in Proc. ACM/IEEE Des. Autom. Conf., 2000, pp. 671–674.

[9] L.-D. Huang and M. D. F. Wong, “Optical proximity correction (OPC)-friendly maze routing,” in Proc. ACM/IEEE Des. Autom. Conf., Jun. 2003, pp. 812–817.

[10] K. S.-M. Li, C.-L. Lee, C.-C. Su, and J.-E. Chen, “A unified approach to detecting crosstalk faults of interconnects in deep sub-micron VLSI,” in Proc. Asia Testing Symp., Nov. 2004, pp. 145–150.

[11] G. E. Jan, K.-Y. Chang, S. Gao, and I. Parberry, “A 4-geometry maze router and its application on multiterminal nets,” ACM Trans. Design Autom. Electron. Syst., vol. 10, no. 1, pp. 116–135, Jan. 2005.

[12] D. Hightower, “A solution to line routing problems on the continuous plane,” in Proc. ACM/IEEE Des. Autom. Conf., 1969, pp. 1–24.

[13] C. J. Albrecht, “Global routing by new approximation algorithms for multicommodity flow,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 20, no. 5, pp. 622–632, May 2001.

[14] Y.-W. Chang, K. Zhu, and D. F. Wong, “Timing-driven routing for symmetrical-array-based FPGAs,” ACM Trans. Design Autom. Electron. Syst., vol. 5, no. 3, pp. 433–450, Jul. 2000.

[15] M. Marek-Sadowska, “Router planner for custom chip design,” in Proc. IEEE/ACM Int. Conf. Comput.-Aided Des., Nov. 1986, pp. 246–249.

[16] J. Cong, J. Fang, and Y. Zhang, “Multilevel approach to full-chip gridless routing,” in Proc. IEEE/ACM Int. Conf. Comput.-Aided Des., Nov. 2001, pp. 396–403.

[17] C. J. Alpert, J.-H. Huang, and A. B. Kahng, “Multilevel circuit partitioning,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 17, no. 8, pp. 655–667, Aug. 1998.

[18] G. Karypis, R. Aggarwal, V. Kumar, and S. Shekhar, “Multilevel hypergraph partitioning: Application in VLSI domain,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 7, no. 1, pp. 69–79, Mar. 1999.

[19] T. F. Chan, J. Cong, T. Kong, and J. R. Shinnerl, “Multilevel optimization for large-scale circuit placement,” in Proc. IEEE/ACM Int. Conf. Comput.-Aided Des., Nov. 2000, pp. 171–176.

[20] H.-C. Lee, Y.-W. Chang, J.-M. Hsu, and H. Yang, “Multilevel large-scale module floorplanning/placement using B*-trees,” in Proc. ACM/IEEE Des. Autom. Conf., Jun. 2003, pp. 812–817.

[21] M. Hayashi and S. Tsukiyama, “A hybrid hierarchical global router for multi-layer VLSIs,” IEICE Trans. Fundam. Electron. Commun. Comput. Sci., vol. E78-A, no. 3, pp. 337–344, Mar. 1995.

[22] Y.-L. Lin, Y.-C. Hsu, and F.-S. Tsai, “Hybrid routing,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 9, no. 2, pp. 151–157, Feb. 1990.

[23] J. Cong, M. Xie, and Y. Zhang, “An enhanced multilevel routing system,” in Proc. IEEE/ACM Int. Conf. Comput.-Aided Des., Nov. 2002, pp. 51–58.

[24] S.-P. Lin and Y.-W. Chang, “A novel framework for multilevel routing considering routability and performance,” in Proc. IEEE/ACM Int. Conf. Comput.-Aided Des., Nov. 2002, pp. 44–50.

[25] R. Kastner, E. Bozorgzadeh, and M. Sarrafzadeh, “Predictable routing,” in Proc. IEEE/ACM Int. Conf. Comput.-Aided Des., Nov. 2000, pp. 110–114.

[26] T.-Y. Ho, Y.-W. Chang, S.-J. Chen, and D.-T. Lee, “A fast crosstalk- and performance-driven multilevel routing system,” in Proc. IEEE/ACM Int. Conf. Comput.-Aided Des., Nov. 2003, pp. 382–387.

[27] B. Stine, “A closed-form analytical model for ILD thickness variation in CMP processes,” in Proc. CMP-MIC, 1997, pp. 266–273.

[28] C. J. Alpert, A. Devgan, and S. T. Quay, “Buffer insertion for noise and delay optimization,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 18, no. 11, pp. 1633–1645, Nov. 1999.

Katherine Shu-Min Li (S’04–M’06) received the B.S. degree from Rutgers University, New Brunswick, NJ, and the M.S. and Ph.D. degrees from the National Chiao Tung University, Hsinchu, Taiwan, R.O.C., in 2001 and 2006, respectively.

She is currently an Assistant Professor in the Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan. Her research interests focus on crosstalk effects, signal integrity, SOC testing, floorplanning and routing for testability and yield enhancement, design for manufacturing, design for yield, transition faults, scan reordering, scan routing, low-power scan techniques, particularly on oscillation ring test scheme, and interconnect optimization in deep submicron and nanotechnology.

Dr. Li is a member of the IEEE Circuits and Systems Society (CAS), the Association for Computing Machinery (ACM), and ACM/Special Interest Group on Design Automation (SIGDA).

Yao-Wen Chang (S’94–M’96) received the B.S. degree from National Taiwan University, Taipei, Taiwan, in 1988, and the M.S. and Ph.D. degrees from the University of Texas at Austin in 1993 and 1996, respectively, all in computer science.

He is a Professor in the Department of Electrical Engineering and the Graduate Institute of Electronics Engineering, National Taiwan University. He is currently also a Visiting Professor at Waseda University, Japan. He was with the IBM T. J. Watson Research Center, Yorktown Heights, NY, in the summer of 1994. From 1996 to 2001, he was on the faculty of National Chiao Tung University, Taiwan. His current research interests lie in VLSI physical design, design for manufacturing, and FPGA. He has been working closely with industry on projects in these areas. He has coauthored one book on routing and over 120 ACM/IEEE conference/journal papers in these areas.

Dr. Chang received an award at the 2006 ACM ISPD Placement Contest, Best Paper Award at ICCD-1995, and nine Best Paper Award Nominations from DAC-2007, ISPD-2007 (two), DAC-2005, 2004 ACM TODAES, ASP-DAC-2003, ICCAD-2002, ICCD-2001, and DAC-2000. He has received many awards for research performance, such as the inaugural First-Class Principal Investigator Awards and the 2004 Mr. Wu Ta You Memorial Award from the National Science Council of Taiwan, the 2004 MXIC Young Chair Professorship from the MXIC Corp, and for excellent teaching from National Taiwan University and National Chiao Tung University. He is an editor of the Journal of Computer and Information Science. He has served on the ACM/SIGDA Physical Design Technical Committee and the technical program committees of ASP-DAC (topic chair), DAC, DATE, FPT (program co-chair), GLSVLSI, ICCAD, ICCD, IECON (topic chair), ISPD, SOCC (topic chair), TENCON, and VLSI-DAT (topic chair). He is currently an independent board member of Genesys Logic, Inc, the chair of the Design Automation and Testing (DAT) Consortium of the Ministry of Education, Taiwan, a member of the board of governors of the Taiwan IC Design Society, and a member of the IEEE Circuits and Systems Society, ACM, and ACM/SIGDA.

Chung-Len Lee (S’71–M’81) received the B.S. degree in electrical engineering from the National Taiwan University, Taipei, Taiwan, R.O.C., in 1968, and the M.S. and Ph.D. degrees in electrical engineering from Carnegie Mellon University, Pittsburgh, PA, in 1971 and 1975, respectively.

In 1975, he joined the National Chiao Tung University, Hsinchu, Taiwan, as a Faculty Member, where he was engaged in research on various topics on optoelectronics, semiconductor devices, integrated circuits, and processes. He has published more than 300 papers in these areas.

Dr. Lee is a member of Phi Kappa Phi and Phi Tau Phi.

Chauchin Su (S’80–M’90) received the B.S. and M.S. degrees from the National Chiao Tung University, Hsinchu, Taiwan, R.O.C., in 1979 and 1981, respectively, and the Ph.D. degree from the University of Wisconsin, Madison, in 1990, all in electrical engineering.

In 1990, he joined the Department of Electrical Engineering, National Central University, Jhungli, Taiwan. In 2000, he transferred to the Department of Electrical and Control Engineering, National Chiao Tung University, where he is currently a Professor. His current research interests focus on the design and test of high-speed serial links and mixed signal circuits.

Prof. Su has served as a Technical Program Committee Member of the International Conference on Computer-Aided Design, Asia and South Pacific Design Automation Conference, Asian Test Symposium (ATS), and International Mixed-Signal Test Workshop. He has also served as the Technical Program Cochair of ATS 2000 and the General Cochair of ATS 2004. Presently, he is the Executive Director of the National System-on-Chip Program, Taiwan.

Jwu E. Chen (S’88–M’92) received the B.S., M.S., and Ph.D. degrees in electronics engineering from National Chiao-Tung University, Hsinchu, Taiwan, R.O.C., in 1984, 1986, and 1990, respectively.

He was a Faculty Member in the Department of Electrical Engineering, Chung Hua University, Hsinchu, from 1990 to 2004. He is currently a Faculty Member in the Department of Electrical Engineering, National Central University, Jhungli, Taiwan. His research interests include multiple-valued logic, VLSI testing, reliable computing, yield analysis, and humanoid robotics.