Crosstalk-Driven Placement with Considering On-Chip Mutual Inductance and RLC Noise


Master's Thesis, Department of Communication Engineering, National Chiao Tung University

Student: Chen-Hsuan Chiu (邱震軒)
Advisor: Prof. Yu-Min Lee (李育民)

A thesis submitted to the Department of Communication Engineering, College of Electrical Engineering and Computer Science, National Chiao Tung University, in partial fulfillment of the requirements for the degree of Master in Communication Engineering.

October 2005, Hsinchu, Taiwan, Republic of China

Abstract

As deep-submicron technologies scale down to 0.18 µm and below, crosstalk noise has become a critical issue that designers cannot neglect. This thesis proposes a novel crosstalk-driven placement algorithm that reduces the noise caused by on-chip mutual inductance and by the resistance, inductance, and capacitance (RLC) of the wires. We also demonstrate that considering only RC noise during placement gives an overly optimistic estimate of the noise produced by the actual circuit. Experimental results show that, compared with area-driven placement, our approach reduces the probabilistic RLC noise by 44.1% and the total estimated wirelength by 30.1% on average, at the cost of an average increase of only 8.4% in total area. Compared with congestion-driven and RC-driven placement, our algorithm improves the probabilistic RLC noise by 15.9% and 8.9% on average, and shortens the total estimated wirelength by 14.9% and 6.8% on average, respectively.

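Acknowledgments

Now that this thesis has been completed, I would first like to thank my advisor, Dr. Yu-Min Lee. His professional background always allowed him to point out the blind spots in my thinking at the right moment, and to guide me to trace problems to their roots and then solve them. Under his guidance during my two years in the master's program, I learned the proper attitude and methods for doing research. I believe this influence will extend well beyond graduate school and will benefit my future growth. I also thank the laboratory of Professor Yao-Wen Chang at National Taiwan University for providing the B*-tree source code, which gave this research a solid framework to build on and allowed it to be completed. I am grateful as well to my labmates, my senior 至鴻 and 義琅, 逸宏, and 培育. Their company and valuable opinions helped me notice the problems I had overlooked in my research and sparked the ideas that solved them. Finally, my deepest gratitude goes to my family, who have always supported me, and to my beloved girlfriend 宜驊. You stayed with me quietly when my spirits were lowest and walked with me without regret. Because of you, all this hard work has finally been rewarded, and my joy can be shared.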

Contents

1 Introduction
  1.1 Noise Fundamentals
  1.2 Conventional Solutions for Signal Integrity
  1.3 Motivation
  1.4 Our Contribution
  1.5 Organization of this Thesis

2 Preliminaries
  2.1 Traditional VLSI Design Flow
  2.2 Basic Concept of Placement
  2.3 Probabilistic Model for Congestion Estimation
  2.4 Analytical RLC Model for Noise Estimation
  2.5 B*-Tree for Placement Representation

3 Crosstalk-Driven Placement
  3.1 Problem Formulation
  3.2 Probabilistic RLC Noise Estimation
    3.2.1 Procedure Flow of Probabilistic Noise Estimation
    3.2.2 Average Circuit Elements Calculation
    3.2.3 Peak Noise Estimation
    3.2.4 Overall Average Probabilistic Noise Estimation
  3.3 Upper Bound of the Coexisting Probability
  3.4 Partial Estimation
    3.4.1 Partial Congestion Estimation
    3.4.2 Partial Probabilistic Noise Estimation
  3.5 Algorithm Flow of Crosstalk-Driven Placement

4 Experimental Results

5 Conclusion

List of Figures

1.1 The configuration of two coupled wires.
1.2 Supply noise and IR-drop on a P/G network.
1.3 Charge-sharing between device M1 and M2.
1.4 Buffer insertion in an RLC wire for mitigation of its noise and propagation delay.
2.1 Traditional VLSI design flow.
2.2 Comparison of the total wirelength. (a) A placement with longer total wirelength. (b) A placement with shorter total wirelength.
2.3 Various shapes of the two-pin nets.
2.4 Offgrid pins of a two-pin net.
2.5 The configuration of two coupled wires.
2.6 Reflection waves in the two coupled transmission line.
2.7 (a) An admissible placement. (b) The corresponding B*-tree representation for the placement.
3.1 Algorithm flowchart of the general placement.
3.2 Algorithm flowchart of our crosstalk-driven placement.
3.3 Procedure flow of probabilistic noise estimation. (a) Summarized flow. (b) Detailed flow.
3.4 Cross-view of parallel wires. (a) Cross-section diagram of parallel wires on the top layer. (b) Cross-section diagram of parallel wires in the middle layers.
3.5 The probabilistic noise estimation between victim net A and aggressor net B.
3.6 Reflection behavior in the near-end (source) and far-end (tail) of the victim wire.
3.7 Sum up turning points of each reflective wave to pick out the peak noise.
3.8 Re-check the blocks whose placement orders after that of the selected nodes. (a) Node n4 and n5 are selected during a B*-tree perturbation. (b) The corresponding placement order.
3.9 Illustration of the partial congestion estimation procedure.
4.1 The RLC-driven placement configuration of apte.
4.2 The RLC-driven placement configuration of hp.
4.3 The RLC-driven placement configuration of xerox.
4.4 The RLC-driven placement configuration of ami33.
4.5 The RLC-driven placement configuration of ami49.
4.6 The RLC-driven placement configuration of ckt529.
4.7 The RLC-driven placement configuration of ckt1681.

List of Tables

1.1 Peak coupling noise with and without considering mutual inductance.
2.1 Algorithm of the probabilistic model.
3.1 Separating probability of two nets in a grid with 10 available routing tracks.
3.2 Ramp waves in the victim wire shown in Fig. 3.6.
3.3 Algorithm of probabilistic RLC noise estimation.
3.4 Algorithm of the partial congestion estimation.
3.5 Algorithm of the proposed crosstalk-driven placement.
4.1 Number of cells and nets of MCNC and our benchmarks.
4.2 Placement flow of the area-driven placement.
4.3 Placement flow of the congestion-driven placement.
4.4 Placement flow of the RC-driven placement.
4.5 Placement flow of the RLC-driven placement.
4.6 Results of area-driven placement.
4.7 Results of congestion-driven placement.
4.8 Results of RC-driven placement.
4.9 Results of RLC-driven placement.
4.10 Peak probabilistic RLC noise of each benchmark.

Chapter 1
Introduction

This chapter presents the basic concepts of signal integrity problems. Because many studies have found that on-chip noise becomes very serious in the deep-submicron era, we first lay out the fundamentals of noise in Section 1.1. Several conventional solutions to signal integrity problems are then described in Section 1.2. Finally, the motivation of this research, our contributions, and the organization of this thesis are stated in Sections 1.3, 1.4, and 1.5, respectively.

1.1 Noise Fundamentals

In today's VLSI (very-large-scale integration) design, increasing circuit complexity and wire congestion make the coupling effects between interconnects more severe than before. In particular, when the technology scales down to 0.18 µm [3] or the operating frequency rises into the GHz range, signal integrity and coupling noise become critical issues that designers can no longer ignore. In general, on-chip noise includes the following kinds [1, 2, 4]:

• Interconnect coupling noise: Coupling noise, or crosstalk, is primarily due to capacitive and inductive coupling between metal wires. In Fig. 1.1, two parallel wires are represented by an RLC model with an active voltage source Vs(t) and lumped capacitive loads. The coupled noise interferes with the victim line through the coupling capacitance (Cx) and the mutual inductance (Lx), so the voltage at the victim's far end does not stay quiet but fluctuates. As the clock rate of a circuit increases, inductive coupling noise comes to dominate the noise effects of the circuit.

[Fig. 1.1: The configuration of two coupled wires.]

• Supply noise: Voltage noise on the supply is caused by nearby wires switching or by local IR-drop. IR-drop is a voltage fluctuation caused by the resistance of the on-chip power delivery network. In Fig. 1.2, every nodal voltage of the power/ground (P/G) network should ideally equal VDD. Once the interconnect resistance is taken into account, however, each nodal voltage of the P/G network is lower than VDD. This kind of noise may cause circuits to malfunction and degrades the signal integrity of the design.

[Fig. 1.2: Supply noise and IR-drop on a P/G network.]

• Charge-sharing noise: Charge-sharing noise arises when a new circuit path is opened to a diffusion capacitance held at a different voltage. It can induce a small noise pulse that causes a malfunction. Fig. 1.3 shows two NMOS devices, M1 and M2, with their grounded diffusion capacitances C1, C2, and C3. At time 2, M1 turns off and M2 turns on, so the charge at V3 is redistributed; this is called charge sharing. As a result, the logic levels of V2 and V3 may differ from what the designer expects.

[Fig. 1.3: Charge-sharing between device M1 and M2. The figure tabulates the node values over time:]

  time   Vi   VA   V2    VB   V3
  0      1    1    1     1    1
  1      0    1    0     0    1
  2      0    0    1/2   1    1/2   (charge sharing)

• Source-drain leakage: In deep-submicron technologies, the threshold voltage of a device is much lower than before (about 0.15 V), and a MOS transistor cannot turn off completely in the cut-off region. A small source-drain leakage current therefore flows when devices are turned off, and this produces noise.

1.2 Conventional Solutions for Signal Integrity

The on-chip noise sources introduced in Section 1.1 can seriously damage the signal integrity of a circuit and can even cause the design to fail. Several useful solutions have been proposed to overcome them [1, 5]:

• Minimize the coupling range: This is the intuitive way to mitigate noise between interconnects. Increasing the spacing between wires reduces their coupling capacitance and mutual inductance, so the crosstalk noise is reduced. Nevertheless, once the effects of mutual inductance are considered, many studies [5, 6, 7, 9, 24] have shown that increasing the spacing is not a good way to deal with inductive coupling noise.

• Differential signals: If designers know the switching patterns of the bus or signal wires, signals that tend to switch in the same direction should be interleaved with signals that switch in the opposite direction. If a wire has strong inductive coupling, designers should minimize the number of wires switching in the same direction by interleaving the wire with its logic inversion (signal-bar). Conversely, if the capacitive coupling noise of a wire is stronger, we should avoid placing its logic inversion right next to it, because doing so slows down the signal and increases the noise on a wire dominated by capacitive coupling. This method provides enough nearby current return paths for fast signals, and it usually suits clock wires.

• Buffer insertion: This is another effective technique to mitigate on-chip noise and improve timing. Long wires are shortened by inserting buffers, which reduces the wire parasitics. Fig. 1.4 illustrates n buffers inserted uniformly into an RLC wire. Shortening the wire scales down its resistance, capacitance, and inductance, so the RC delay and the coupling noise between parallel interconnects can be controlled.

[Fig. 1.4: Buffer insertion in an RLC wire for mitigation of its noise and propagation delay. Each of the n segments has length L/n; the figure labels the per-segment parasitics as Rl/n, Cl/n, and Ll/(k(n/2)), with k = 2 ~ 2.5.]

• Shielding wire insertion: This is the most common and efficient way to overcome crosstalk noise. Adding a P/G wire (also called a shielding wire) near a critical line provides a current return path for it and makes the coupling effects between wires fall off quickly. However, a P/G wire is much wider than a signal wire, so this solution spends more routing resources to improve the signal integrity of the circuit.

1.3 Motivation

With recent advances in deep-submicron technology, crosstalk noise has become the major problem affecting the behavior of VLSI designs. Neglecting on-chip noise usually causes a design to fail. Traditional flows mainly perform extraction after routing and then run noise analysis to verify the signal integrity of the design. If the result violates the noise constraint, designers must repeatedly revise the circuit topology. This flow is inefficient and unsuitable for today's VLSI design. To overcome signal integrity problems, researchers have argued that considering coupling effects during the placement stage is a more effective flow, and several crosstalk-driven placement methods [12, 14, 15] have been investigated.

[12] employed a congestion map based on the probabilistic model [13] to control routing congestion during placement, used a quick global router that skips the layer assignment phase to estimate the routing topology, and calculated the average coupling capacitance of each wire segment. A coupling capacitance map was then generated to guide the placement. [14] contended that a coupling capacitance map cannot completely indicate the noise behavior during placement. In that work, a global router first produced a global congestion map, and the coupling capacitance of each wire segment was then extracted. Finally, the extracted capacitances were used to produce a noise map based on the RC model [10] to guide the placement. [15] proposed a crosstalk-driven placement based on a genetic algorithm (GA) with a two-level hierarchical structure, consisting of outline and detailed levels, to improve coupling capacitance noise, RC delay, and power consumption. In its crosstalk estimation, the coupling capacitance between the aggressor and the victim was determined according to the switching states of the signals.
However, all previous work on crosstalk-driven placement takes only RC coupling noise into account; the effects of mutual inductance have not been considered when estimating crosstalk noise. To demonstrate the importance of mutual inductance, we implemented the RLC noise model proposed in [11] to estimate the peak coupling noise between two identical wires in a 0.13 µm / 1.2 V technology. The wire width, thickness, length, unit of space between wires, aggressor resistance, victim resistance, and input rise time are 0.16 µm, 0.28 µm, 3000 µm, 0.18 µm, 75 Ω, 50 Ω, and 100 ps, respectively. Table 1.1 shows the peak coupling noise with and without the mutual inductance (Lx).

  Spacing    Peak noise without Lx    Peak noise with Lx
  space 1    0.2352 V                 0.3276 V
  space 2    0.1087 V                 0.3109 V
  space 3    0.0617 V                 0.3007 V
  space 4    0.0395 V                 0.2954 V
  space 5    0.0273 V                 0.2920 V
  space 6    0.0200 V                 0.2897 V
  space 7    0.0153 V                 0.2879 V

Table 1.1: Peak coupling noise with and without considering mutual inductance.

The estimated coupling noise without mutual inductance drops quickly: once the wires are separated by more than 5 space units, the noise falls below about 10% of its value at 1 space unit. With mutual inductance included, however, the coupling noise declines very slowly. This shows that mutual inductance dominates the coupling noise in deep-submicron technologies, and that ignoring it during crosstalk-driven placement severely underestimates the crosstalk noise.

1.4 Our Contribution

In this thesis, we propose a placement method that is capable of considering on-chip RLC noise. Our main contributions include:

• The proposed placement method introduces a novel technique that deals with the crosstalk noise due to on-chip mutual inductance during placement. First, a probabilistic model is used for congestion estimation. Then a transmission-line-based RLC model [11] is employed to estimate the peak coupling noise between wires. With the congestion and peak noise information, an effective algorithm that computes the worst-case average probabilistic noise is developed to guide the placement.

• Our placer uses the probabilistic model proposed in [13] to estimate routing congestion during placement. If the routing demand is larger than the grid capacity, we consider that the probability of two nets (net A and net B) coexisting in this grid is no longer equal to PA × PB. Hence, we propose a set of equations that calculate an upper bound on the coexisting probability of two nets when estimating their RLC crosstalk noise.

1.5 Organization of this Thesis

The rest of this thesis is organized as follows. Chapter 2 introduces a probabilistic model [13] for congestion estimation, an analytical RLC model [11], and the B*-tree representation [16]. Chapter 3 describes the proposed algorithm flow. In Chapter 4, we compare the experimental results of area-driven, congestion-driven, RC-driven, and RLC-driven placement. Finally, conclusions are given in Chapter 5.
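The trend read off Table 1.1 in Section 1.3 can be checked with a few lines of arithmetic. The sketch below only re-uses the seven tabulated peak values; the function names are illustrative and not part of the thesis.

```python
# Peak far-end coupling noise from Table 1.1, in volts, for spacings of 1..7 units.
peak_without_lx = [0.2352, 0.1087, 0.0617, 0.0395, 0.0273, 0.0200, 0.0153]
peak_with_lx = [0.3276, 0.3109, 0.3007, 0.2954, 0.2920, 0.2897, 0.2879]


def relative_to_first(values):
    """Express every entry as a fraction of the value at 1 space unit."""
    return [v / values[0] for v in values]


def underestimate_factor(with_lx, without_lx):
    """Ratio of the RLC estimate to the RC-only estimate at each spacing."""
    return [w / wo for w, wo in zip(with_lx, without_lx)]


rc_ratio = relative_to_first(peak_without_lx)
rlc_ratio = relative_to_first(peak_with_lx)
factor = underestimate_factor(peak_with_lx, peak_without_lx)

for space, (rc, rlc, f) in enumerate(zip(rc_ratio, rlc_ratio, factor), start=1):
    print(f"space {space}: without Lx {rc:6.1%}, with Lx {rlc:6.1%}, ratio {f:5.1f}x")
```

Run on the tabulated values, the RC-only column falls below 10% of its initial value from 6 space units onward, while the column with Lx keeps more than 87% of its initial value at 7 space units.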

Chapter 2
Preliminaries

This chapter introduces the background knowledge used in our crosstalk-driven placement. We first introduce the traditional VLSI design flow and the basic concept of placement. Next, the probabilistic model used for our congestion estimation is presented. In Section 2.4, we introduce a transmission-line-based RLC model for estimating on-chip RLC noise during placement. Finally, the B*-tree representation is described in Section 2.5.

2.1 Traditional VLSI Design Flow

The traditional VLSI design flow is shown in Fig. 2.1 [22]. First, designers synthesize their circuits with synthesis tools from descriptions written in Verilog or VHDL. The synthesized circuit is then partitioned into several blocks, either according to circuit function or so as to minimize the number of cuts between blocks. In the floorplan and placement stage, each functional block is placed at a location chosen to minimize objectives such as total area, wirelength, congestion, crosstalk, or power consumption. After placement, the routing stage is performed. In general, this stage emphasizes routability, wiring congestion, and timing improvement. When routing is complete, compaction is performed to minimize the total area, and extraction and circuit verification are performed to verify performance and signal integrity. Finally, the design is taped out.

[Fig. 2.1: Traditional VLSI design flow. Stages: circuit synthesis; partitioning; floorplan & placement; routing; compaction; extraction & verification; fabrication.]

In addition, design stages 2 to 6 (from the partitioning stage to the extraction & verification stage) belong to the physical design region.

2.2 Basic Concept of Placement

In the physical design flow, placement is a crucial stage that affects the performance of the design. A good placement topology achieves high performance together with small area, propagation delay, wirelength, and crosstalk noise. For example, Fig. 2.2 illustrates two placement solutions with different total wirelength. The placement in Fig. 2.2(a) has the longer total wirelength; if we change its topology to that of Fig. 2.2(b), we obtain a better wirelength solution.

[Fig. 2.2: Comparison of the total wirelength. (a) A placement with longer total wirelength. (b) A placement with shorter total wirelength.]

However, balancing the objectives above, let alone finding the optimal placement, is an NP-complete problem. To solve this problem and determine a proper location for each block, many placement algorithms have been proposed, such as force-directed, simulated-evolution, and simulated-annealing (SA) algorithms [22, 23]. We choose the SA algorithm for our placement because it is popular and useful, and because it is capable of obtaining the optimal placement solution.

2.3 Probabilistic Model for Congestion Estimation

Although a global router can estimate wire congestion accurately, it is too costly. Therefore, an efficient probabilistic model [13] is adopted to estimate the congestion. This model is accurate and efficient enough to predict routing congestion during placement. We first divide the design into uniform rectangular grids proportional to its core area, and then analyze the congestion in every grid. Before explaining how the model works, we give several definitions.

• Definition 1: The capacity of a grid is the number of allowable routing tracks within the grid. It consists of a horizontal and a vertical capacity, given by Equations (2.1) and (2.2), respectively. Here, the number of horizontal layers is $N^h$, the number of vertical layers is $N^v$, and the minimum pitches of the $i$th horizontal and vertical layers are $L^h_i$ and $L^v_i$, respectively. The width and height of each grid are assumed to be $\mathit{Width}$ and $\mathit{Height}$.

\[ \text{horizontal capacity} = \mathit{Height} \times \sum_{i=1}^{N^h} \frac{1}{L^h_i} \tag{2.1} \]
\[ \text{vertical capacity} = \mathit{Width} \times \sum_{i=1}^{N^v} \frac{1}{L^v_i} \tag{2.2} \]

• Definition 2: The usage of a grid is the number of used routing tracks within the grid. As in Definition 1, the usage of each grid consists of a horizontal and a vertical usage.

• Definition 3: If the total horizontal or vertical usage of all nets $i$ within grid(m,n) is larger than the corresponding capacity, the grid is congested, that is,

\[ \frac{\sum_{i \in (m,n)} H\,\mathit{usage}_i}{\mathit{capacity}_H \text{ of grid}(m,n)} > 1 \tag{2.3} \]
\[ \frac{\sum_{i \in (m,n)} V\,\mathit{usage}_i}{\mathit{capacity}_V \text{ of grid}(m,n)} > 1 \tag{2.4} \]

where H and V denote the horizontal and vertical directions, respectively.

• Definition 4: $F(m,n)$ is the total number of possible ways to optimally route a two-pin net covering an $m \times n$ mesh, which is the minimum routing region of the net.

\[ F(m,n) = C^{\,m+n-2}_{\,m-1} = C^{\,m+n-2}_{\,n-1} \tag{2.5} \]

• Definition 5: We define three types of shapes for the two-pin nets used in the model: the short net, the flat net, and the 3rd type net. Fig. 2.3 illustrates the topologies of these types. A short net is a two-pin net whose source and sink are within the same grid. A two-pin net whose source and sink lie in the same row or the same column of grids is called a flat net. Otherwise, if the two-pin net covers more than one row and more than one column of grids, we call it a 3rd type net.
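Definitions 1, 3, and 4 translate directly into code. The following is a minimal sketch, assuming pitches and grid dimensions are given in the same length unit; the function names and the sample numbers are illustrative and do not come from the thesis.

```python
from math import comb


def grid_capacity(height, width, h_pitches, v_pitches):
    """Eqs. (2.1)-(2.2): allowable routing tracks of one grid.

    h_pitches / v_pitches hold the minimum pitch of every horizontal /
    vertical routing layer, in the same unit as height and width.
    """
    horizontal = height * sum(1.0 / pitch for pitch in h_pitches)
    vertical = width * sum(1.0 / pitch for pitch in v_pitches)
    return horizontal, vertical


def is_congested(usages, capacity):
    """Eqs. (2.3)-(2.4): a grid is congested when total usage / capacity > 1."""
    return sum(usages) / capacity > 1.0


def num_routes(m, n):
    """Eq. (2.5): number of shortest routes of a two-pin net over an m x n mesh."""
    return comb(m + n - 2, m - 1)


# Hypothetical 10 x 10 grid with two horizontal layers and one vertical layer.
cap_h, cap_v = grid_capacity(10.0, 10.0, h_pitches=[0.5, 1.0], v_pitches=[0.5])
print(cap_h, cap_v)                             # horizontal and vertical track counts
print(is_congested([12.5, 9.0, 10.0], cap_h))   # summed usage exceeds the capacity
print(num_routes(3, 3))                         # routes across a 3 x 3 mesh
```

Because every F(m,n) is a single binomial coefficient, the whole F matrix can be tabulated once before placement starts, which is what makes the per-net usage lookup cheap.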

[Fig. 2.3: Various shapes of the two-pin nets (short, flat, and 3rd type).]

With these definitions, the horizontal usage $P_H(i,j)$ and the vertical usage $P_V(i,j)$ of each shape of two-pin net within its grids can be calculated by the following equations. The quantities $\mathit{Width}$, $\mathit{Height}$, $dx_1$, $dx_2$, $dy_1$, and $dy_2$ are illustrated in Fig. 2.4.

[Fig. 2.4: Offgrid pins of a two-pin net, showing Width, Height, dx1, dx2, dy1, and dy2.]

• Short net:

\[ P_H(i,j) = \frac{|dx_1 + dx_2 - \mathit{Width}|}{\mathit{Width}} \tag{2.6} \]
\[ P_V(i,j) = \frac{|dy_1 + dy_2 - \mathit{Height}|}{\mathit{Height}} \tag{2.7} \]

• Flat net: If the source and sink of the net are in the same column and the net covers $m$ rows of grids, its horizontal and vertical usage within grid(i,j) can be written as:

\[ P_H(i,j) = \begin{cases} \dfrac{|dx_1 + dx_2 - \mathit{Width}|}{2 \times \mathit{Width}}, & i = 1, m \\ 0, & \text{otherwise} \end{cases} \tag{2.8} \]
\[ P_V(i,1) = \begin{cases} \dfrac{dy_1}{\mathit{Height}}, & i = 1 \\ \dfrac{dy_2}{\mathit{Height}}, & i = m \\ 1, & \text{otherwise} \end{cases} \tag{2.9} \]

Similarly, if the pins of the two-pin net are in the same row and the net covers $n$ columns of grids, its probabilistic usage within grid(i,j) can be computed by:

\[ P_H(1,j) = \begin{cases} \dfrac{dx_1}{\mathit{Width}}, & j = 1 \\ \dfrac{dx_2}{\mathit{Width}}, & j = n \\ 1, & \text{otherwise} \end{cases} \tag{2.10} \]
\[ P_V(1,j) = \begin{cases} \dfrac{|dy_1 + dy_2 - \mathit{Height}|}{2 \times \mathit{Height}}, & j = 1, n \\ 0, & \text{otherwise} \end{cases} \tag{2.11} \]

• 3rd type net:

\[ P_H(i,j) = \frac{1}{F(m,n)} \times \begin{cases}
F(m,n-1) \times \dfrac{dx_1}{\mathit{Width}}, & i = 1,\ j = 1 \\
F(m,n-1) \times \dfrac{dx_2}{\mathit{Width}}, & i = m,\ j = n \\
\dfrac{dx_2}{\mathit{Width}}, & i = 1,\ j = n \\
\dfrac{dx_1}{\mathit{Width}}, & i = m,\ j = 1 \\
F(m-i+1,n-1) \times \dfrac{dx_1}{\mathit{Width}}, & 1 < i < m,\ j = 1 \\
F(i,n-1) \times \dfrac{dx_2}{\mathit{Width}}, & 1 < i < m,\ j = n \\
\dfrac{F(m,n-j+1) + F(m,n-j)}{2}, & i = 1,\ 1 < j < n \\
\dfrac{F(m,j) + F(m,j-1)}{2}, & i = m,\ 1 < j < n \\
\dfrac{F(i,j)F(m-i+1,n-j) + F(i,j-1)F(m-i+1,n-j+1)}{2}, & \text{otherwise}
\end{cases} \]

\[ P_V(i,j) = \frac{1}{F(m,n)} \times \begin{cases}
F(m-1,n) \times \dfrac{dy_1}{\mathit{Height}}, & i = 1,\ j = 1 \\
F(m-1,n) \times \dfrac{dy_2}{\mathit{Height}}, & i = m,\ j = n \\
\dfrac{dy_1}{\mathit{Height}}, & i = 1,\ j = n \\
\dfrac{dy_2}{\mathit{Height}}, & i = m,\ j = 1 \\
\dfrac{F(m-i+1,n) + F(m-i,n)}{2}, & 1 < i < m,\ j = 1 \\
\dfrac{F(i,n) + F(i-1,n)}{2}, & 1 < i < m,\ j = n \\
F(m-1,n-j+1) \times \dfrac{dy_1}{\mathit{Height}}, & i = 1,\ 1 < j < n \\
F(m-1,j) \times \dfrac{dy_2}{\mathit{Height}}, & i = m,\ 1 < j < n \\
\dfrac{F(i,j)F(m-i,n-j+1) + F(i-1,j)F(m-i+1,n-j+1)}{2}, & \text{otherwise}
\end{cases} \]

Finally, the probabilistic model can be implemented by the algorithm shown in Table 2.1.

  Algorithm of Probabilistic Model
   1  Begin
   2    Compute the capacity of each grid
   3    Compute the F(m,n) matrix
   4    For each net in the design
   5      MST(net)
   6      For each segment of the MST
   7        Determine the size of mesh
   8        Compute the horizontal and vertical usages within its grids
   9      EndFor
  10    EndFor
  11    For each grid in the design
  12      Compute the congestion of the grid
  13    EndFor
  14  End

Table 2.1: Algorithm of the probabilistic model.

In the algorithm of Table 2.1, lines 1 to 3 are the preparation for the model. Each multi-pin net in the design is decomposed into two-pin nets by the minimum spanning tree (MST) technique [8]. Then, for each two-pin net, we calculate its horizontal and vertical usages within its routing grids by the equations above. With the precomputed F(m,n) matrix, this step takes constant time. Finally, the congestion of each grid in the design is computed by Equations (2.3) and (2.4).

Assume that the number of nets in a design is $n$, the size of the grid array is $m \times m$, and the maximum number of pins of any net is $p$. The overall runtime complexity of this model is $O(np^2 + m^2np)$. If the grid size and the maximum number of pins are constants, the runtime is linear in the number of nets in the design.

2.4 Analytical RLC Model for Noise Estimation

Two parallel interconnects can be modeled by the transmission-line-based model [11] illustrated in Fig. 2.5, where $R$, $L$, $C$, $C_x$, and $L_x$ are the unit resistance, inductance, capacitance, coupling capacitance, and mutual inductance of the wires, respectively. The figure shows two coupled interconnects, of which one line is active and the other is quiet. The active line is called the "aggressor" and the quiet line the "victim". The driver of the aggressor is modeled as a ramp voltage $V_s$ with an equivalent resistance $R_s$. The driver of the victim is represented by an equivalent resistance $R_v$ connected to ground. The sink at the far end of each wire is modeled as a lumped capacitive load.

[Fig. 2.5: The configuration of two coupled wires. The lines run from z = 0 to z = l.]

Two non-identical wires have different line parasitics. The unit line parasitics of the aggressor are $R_1$, $L_1$, and $C_1$, and those of the victim are $R_2$, $L_2$, and $C_2$. At any point $z$ along the wire, the voltage and current waveforms on the aggressor (line 1) and the victim (line 2) satisfy the following set of differential equations:

\[ \begin{aligned}
-\frac{\partial V_1}{\partial z} &= (R_1 + sL_1)I_1 + sL_x I_2 \\
-\frac{\partial V_2}{\partial z} &= (R_2 + sL_2)I_2 + sL_x I_1 \\
-\frac{\partial I_1}{\partial z} &= s(C_1 + C_x)V_1 - sC_x V_2 \\
-\frac{\partial I_2}{\partial z} &= s(C_2 + C_x)V_2 - sC_x V_1
\end{aligned} \tag{2.12} \]

Because the far-end reflection coefficient is approximately +1 [24], the generic solution of the above set of differential equations is given by:

\[ \begin{aligned}
V_1 &= A_1(e^{-\gamma_e z} + e^{\gamma_e z}) + A_3(e^{-\gamma_o z} + e^{\gamma_o z}) \\
V_2 &= A_2(e^{-\gamma_e z} + e^{\gamma_e z}) + A_4(e^{-\gamma_o z} + e^{\gamma_o z}) \\
I_1 &= \frac{A_1}{Z_{0e1}}(e^{-\gamma_e z} - e^{\gamma_e z}) + \frac{A_3}{Z_{0o1}}(e^{-\gamma_o z} - e^{\gamma_o z}) \\
I_2 &= \frac{A_2}{Z_{0e2}}(e^{-\gamma_e z} - e^{\gamma_e z}) + \frac{A_4}{Z_{0o2}}(e^{-\gamma_o z} - e^{\gamma_o z})
\end{aligned} \tag{2.13} \]

For simplicity, we first consider the case of lossless lines, that is, wire resistance $R_1 = R_2 = 0$. The even-mode and odd-mode propagation constants $\gamma_e$ and $\gamma_o$ are

\[ \begin{aligned}
\gamma_e &= s\sqrt{\frac{(a_1 + a_2) + \sqrt{(a_1 - a_2)^2 + 4b_1b_2}}{2}} \\
\gamma_o &= s\sqrt{\frac{(a_1 + a_2) - \sqrt{(a_1 - a_2)^2 + 4b_1b_2}}{2}}
\end{aligned} \tag{2.14} \]

where

\[ \begin{aligned}
a_1 &= L_1(C_1 + C_x) - L_xC_x \\
a_2 &= L_2(C_2 + C_x) - L_xC_x \\
b_1 &= -L_1C_x + L_x(C_2 + C_x) \\
b_2 &= -L_2C_x + L_x(C_1 + C_x)
\end{aligned} \tag{2.15} \]

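In the solution of Equation (2.13), the coefficients are related as:

\[ \begin{aligned}
\frac{A_1}{A_2} &= \frac{(a_1 - a_2) + \sqrt{(a_1 - a_2)^2 + 4b_1b_2}}{2b_2} \\
\frac{A_3}{A_4} &= \frac{(a_1 - a_2) - \sqrt{(a_1 - a_2)^2 + 4b_1b_2}}{2b_2}
\end{aligned} \tag{2.16} \]

The characteristic impedances of the aggressor and victim lines can be written as:

\[ \begin{aligned}
Z_{0e1} &= \frac{s(L_1L_2 - L_x^2)}{\gamma_e\left(L_2 - \frac{A_2}{A_1}L_x\right)} &
Z_{0e2} &= \frac{s(L_1L_2 - L_x^2)}{\gamma_e\left(L_1 - \frac{A_1}{A_2}L_x\right)} \\
Z_{0o1} &= \frac{s(L_1L_2 - L_x^2)}{\gamma_o\left(L_2 - \frac{A_4}{A_3}L_x\right)} &
Z_{0o2} &= \frac{s(L_1L_2 - L_x^2)}{\gamma_o\left(L_1 - \frac{A_3}{A_4}L_x\right)}
\end{aligned} \tag{2.17} \]

The boundary conditions at the near end are given by

\[ \begin{aligned}
\frac{V_s - V_1(z=0)}{I_1(z=0)} &= \frac{V_s - (A_1 + A_3)}{\frac{A_1}{Z_{0e1}} + \frac{A_3}{Z_{0o1}}} = R_s \\
\frac{-V_2(z=0)}{I_2(z=0)} &= \frac{-(A_2 + A_4)}{\frac{A_2}{Z_{0e2}} + \frac{A_4}{Z_{0o2}}} = R_v
\end{aligned} \tag{2.18} \]

Applying these boundary conditions to Equation (2.13), we obtain the voltage steps traveling on the victim line:

\[ \begin{aligned}
A_2 &= \frac{V_s(t)}{\left(\frac{A_1}{A_2}\right)\left(\frac{Z_{0e1} + R_s}{Z_{0e1}}\right) - \left(\frac{A_3}{A_4}\right)\left(\frac{Z_{0e2} + R_v}{Z_{0o2} + R_v}\right)\left(\frac{Z_{0o2}}{Z_{0e2}}\right)\left(\frac{Z_{0o1} + R_s}{Z_{0o1}}\right)} \\
A_4 &= \frac{V_s(t)}{\left(\frac{A_3}{A_4}\right)\left(\frac{Z_{0o1} + R_s}{Z_{0o1}}\right) - \left(\frac{A_1}{A_2}\right)\left(\frac{Z_{0o2} + R_v}{Z_{0e2} + R_v}\right)\left(\frac{Z_{0e2}}{Z_{0o2}}\right)\left(\frac{Z_{0e1} + R_s}{Z_{0e1}}\right)}
\end{aligned} \tag{2.19} \]

For a line of length $l$, the step propagating with the even-mode constant arrives at the far end after an even time of flight $t_{fe}$, and the step propagating with the odd-mode constant arrives at the far end after an odd time of flight $t_{fo}$:

\[ \begin{aligned}
t_{fe} &= l\sqrt{\frac{(a_1 + a_2) + \sqrt{(a_1 - a_2)^2 + 4b_1b_2}}{2}} \\
t_{fo} &= l\sqrt{\frac{(a_1 + a_2) - \sqrt{(a_1 - a_2)^2 + 4b_1b_2}}{2}}
\end{aligned} \tag{2.20} \]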
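The two times of flight in Equation (2.20) depend only on the per-unit-length parasitics and the line length, so they are easy to evaluate numerically. The sketch below is an illustration under stated assumptions: it reads Equation (2.15) with $a_2$ built from $L_2$ and the fourth coefficient as $b_2$, the function names are hypothetical, and the sample parasitics are normalized numbers rather than process data.

```python
from math import sqrt


def lc_coefficients(L1, L2, C1, C2, Cx, Lx):
    """Coefficients of Eq. (2.15) for lossless lines (per-unit-length values)."""
    a1 = L1 * (C1 + Cx) - Lx * Cx
    a2 = L2 * (C2 + Cx) - Lx * Cx   # assumption: a2 uses L2
    b1 = -L1 * Cx + Lx * (C2 + Cx)
    b2 = -L2 * Cx + Lx * (C1 + Cx)  # assumption: fourth coefficient is b2
    return a1, a2, b1, b2


def times_of_flight(length, L1, L2, C1, C2, Cx, Lx):
    """Eq. (2.20): the '+' root gives t_fe and the '-' root gives t_fo."""
    a1, a2, b1, b2 = lc_coefficients(L1, L2, C1, C2, Cx, Lx)
    root = sqrt((a1 - a2) ** 2 + 4.0 * b1 * b2)
    t_fe = length * sqrt(((a1 + a2) + root) / 2.0)
    t_fo = length * sqrt(((a1 + a2) - root) / 2.0)
    return t_fe, t_fo


# Two identical lines with normalized, purely illustrative parasitics.
L, C, Cx, Lx, length = 1.0, 1.0, 0.5, 0.5, 1.0
t_fe, t_fo = times_of_flight(length, L, L, C, C, Cx, Lx)

# For identical lines with b1 = b2 > 0, the roots reduce to the classic
# even/odd-mode expressions, which gives a quick sanity check.
even_closed_form = length * sqrt((L + Lx) * C)
odd_closed_form = length * sqrt((L - Lx) * (C + 2.0 * Cx))
print(t_fe, even_closed_form)
print(t_fo, odd_closed_form)
```

The two modes arrive at different times whenever the two roots differ, and this gap between $t_{fe}$ and $t_{fo}$ is what lets the two reflected ramps fail to cancel at the victim's far end.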

The coupling noise at the victim's far-end is composed of two ramp waves, $2A_2(t - t_{fe})$ and $2A_4(t - t_{fo})$, as illustrated in Fig. 2.6.

Fig. 2.6: Reflection waves in the two coupled transmission lines. This figure shows the first reflection behavior of the wires.

Based on the above equations, the far-end waveforms of the aggressor and victim can be computed by the following steps:
• Given an input ramp $V_s(t)$, calculate the even and odd mode ramps, $A_2(t)$ and $A_4(t)$, by Equation (2.19).
• The ramp $A_2(t)$ reaches the far-end with a time-of-flight delay $t_{fe}$, and $A_4(t)$ reaches the far-end with a time-of-flight delay $t_{fo}$.
• Because the reflection coefficient of the far-end is +1, the ramp voltages are doubled at the far-end of the aggressor and victim lines.
• Superpose the even and odd mode ramps arriving at the far-end to obtain the waveforms of the aggressor and victim lines.
• Reverse traveling waves are reflected at the near-end and add to the far-end waveforms after three time-of-flight delays.

Applying the above steps to Fig. 2.6 shows that the ramp voltages $A_2(t)$ and $A_4(t)$ are produced at the near-end of the lines. These ramps travel with different velocities and reach the far-end after different delays. After superposition, the output waveforms are computed by:

$$
\begin{aligned}
V_{agg}(t) &= 2A_2(t - t_{fe}) - 2A_4(t - t_{fo}) \\
V_{vic}(t) &= 2A_2(t - t_{fe}) + 2A_4(t - t_{fo})
\end{aligned}
\tag{2.21}
$$

where $V_{agg}(t)$ and $V_{vic}(t)$ are the waveforms at the outputs of the aggressor and victim, respectively. Furthermore, for a lossy transmission line, that is, when the wire resistance is not zero, the peak coupling noise at the far-end of a wire can be calculated by Equation (2.22). Here, $R$, $V^{+}$, and $V^{-}$ are the unit wire resistance, the positive peak, and the negative peak, respectively.

$$
\begin{aligned}
V^{+}_{lossy} &= V^{-}_{lossless} \times e^{-\frac{R}{2Z_{0o}}} + (V^{+}_{lossless} - V^{-}_{lossless}) \times e^{-\frac{R}{2Z_{0e}}} \\
V^{-}_{lossy} &= V^{-}_{lossless} \times e^{-\frac{R}{2Z_{0o}}}
\end{aligned}
\tag{2.22}
$$

2.5 B*-Tree for Placement Representation

In this thesis, the B*-tree [16] is used as our placement representation. A B*-tree is an ordered tree for modeling a slicing or a non-slicing placement. Given an admissible placement [17] (that is, one in which no block can be moved left or down), a unique B*-tree can be constructed in linear time to model the placement. Fig. 2.7 shows an admissible placement and its corresponding B*-tree. The root of the B*-tree corresponds to the block in the bottom-left corner. Similar to DFS (depth-first search), a B*-tree T for an admissible placement is constructed by a recursive procedure:

Fig. 2.7: (a) An admissible placement. (b) The corresponding B*-tree representation for the placement.

Beginning from the root, the procedure recursively constructs the left subtree and then the right one. For example, to relate the placement in Fig. 2.7(a) to the B*-tree in Fig. 2.7(b), we first pick n0, the root of T, and place b0 in the bottom-left corner. We then traverse the left child of n0, which is n1, and place b1 to the right of its parent b0. Because n1 does not have a left child, the next node chosen is n3, and b3 is placed on top of b1. Recursively repeating this process in DFS order yields the corresponding admissible placement.

The B*-tree is an efficient representation for placement. It achieves a smaller area because of the admissible placement structure. Moreover, it has a lower run-time complexity than the O-tree representation [17]. That is why we adopt the B*-tree representation for our placement.
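The DFS packing described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis implementation: the function name, the dictionary-based tree encoding, and the block sizes are hypothetical, and the sketch finds the resting height of each block by scanning all placed blocks (quadratic time) instead of maintaining the contour structure that a linear-time implementation would use.

```python
def pack_bstar_tree(tree, sizes, root):
    """Turn a B*-tree into block coordinates.

    tree  : {node: (left_child, right_child)}, missing nodes are leaves
    sizes : {node: (width, height)}
    Returns {node: (x, y)} with the lower-left corner of every block.
    """
    placed = {}

    def top_of(x_lo, x_hi):
        # Highest top edge among placed blocks that overlap [x_lo, x_hi).
        y = 0.0
        for node, (x, y_bottom) in placed.items():
            w, h = sizes[node]
            if x < x_hi and x + w > x_lo:
                y = max(y, y_bottom + h)
        return y

    def visit(node, x):
        w, _ = sizes[node]
        placed[node] = (x, top_of(x, x + w))
        left, right = tree.get(node, (None, None))
        if left is not None:
            visit(left, x + w)   # left child: adjacent on the right of its parent
        if right is not None:
            visit(right, x)      # right child: above, at the same x coordinate
    visit(root, 0.0)
    return placed

# Hypothetical four-block example.
tree = {"n0": ("n1", "n2"), "n1": (None, "n3")}
sizes = {"n0": (2, 1), "n1": (1, 2), "n2": (1, 1), "n3": (1, 1)}
pos = pack_bstar_tree(tree, sizes, "n0")
```

In this example n1 lands to the right of n0, n3 is stacked on n1, and n2 is stacked on n0, which mirrors the left-child and right-child rules in the text.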

Chapter 3
Crosstalk-Driven Placement

In this chapter, we introduce the algorithm flow of our proposed placement. We first state the problem formulation and point out the main difference between our placement and the general placement. In Section 3.2, a novel technique for on-chip RLC noise estimation is proposed, which we use to estimate the probabilistic noise during placement. We then develop a set of equations to compute the upper bound of the coexisting probability of two nets within a routing grid that is likely to overflow. Finally, the whole algorithm of our crosstalk-driven placement is integrated in Section 3.5.

3.1 Problem Formulation

The crosstalk-driven placement problem can be formulated as follows.
• Input: A fixed chip boundary area $A$, a set of blocks $B = \{b_1, b_2, \cdots, b_n\}$, their pin locations $P = \{p_1^1, p_1^2, \cdots, p_2^1, p_2^2, \cdots, p_n^1, p_n^2, \cdots, p_n^n\}$, and a netlist $N$ that represents the connections among the pins in $P$.
• Objective: Determine a location for each $b_i \in B$ within $A$ such that the average probabilistic RLC noise over the nets in $N$ is minimized.
• Output: The locations of each $b_i$ ($b_i \in B$) and $p_i$ ($p_i \in P$), together with the resulting area, estimated total wirelength, congestion, and probabilistic RLC noise.

Fig. 3.1: Algorithm flowchart of the general placement.

Fig. 3.2: Algorithm flowchart of our crosstalk-driven placement.

Fig. 3.1 illustrates the algorithm flow of the general placement. A general placement ordinarily minimizes the area, total wirelength, or congestion, and does not take crosstalk noise into account. Such a flow can only improve the area, timing, or routability, which is not enough for today's deep-submicron technologies: ignoring on-chip noise during placement can cause the circuit to malfunction.

The proposed crosstalk-driven placement is based on the Simulated-Annealing (SA) algorithm and the B*-tree [16] representation. The algorithm flow is shown in Fig. 3.2. At the initial placement stage, we use the linear ordering technique [18] to obtain a better initial solution. After the initial placement, congestion estimation is performed. Since the probabilistic model [13] is based on two-pin nets, each multi-pin net is first decomposed into several two-pin nets by the Minimum Spanning Tree (MST) technique. Then, for each two-pin net, the horizontal and vertical probabilistic usages within its shortest-path routing region are computed. The detailed congestion estimation procedure is described in Section 2.3. The probabilistic information is used both to control the overall routing density and to estimate the probabilistic noise of each two-pin net.

When the congestion estimation stage is completed, we perform the probabilistic RLC noise estimation for each two-pin net. This stage is described in the next section. After the probabilistic RLC noise of each two-pin net has been estimated, the placement flow finishes if the overall average probabilistic noise meets the noise constraint or the SA temperature is cool enough. Otherwise, the placer iteratively perturbs the B*-tree to seek a better solution.

3.2 Probabilistic RLC Noise Estimation

When the congestion estimation stage of our placement is complete, the flow proceeds to the probabilistic RLC noise estimation.
This section describes the probabilistic noise estimation and how it works. We explain each stage of the probabilistic noise estimation in Sections 3.2.1 to 3.2.4, and integrate the procedures of the probabilistic noise estimation in Section 3.2.4.

Fig. 3.3: Procedure flow of probabilistic noise estimation. (a) Summarized flow. (b) Detailed flow.

3.2.1 Procedure Flow of Probabilistic Noise Estimation

After the congestion estimation stage, the probabilistic usages of each two-pin net within its routing grids are available, and we proceed to the next stage, the probabilistic noise estimation. Fig. 3.3 illustrates the summarized and detailed flows of this stage. The probabilistic noise estimation of our crosstalk-driven placement can be subdivided into the avg_ckt_element calculation, the peak noise estimation, and the Overall_Noise_avg calculation.

In the avg_ckt_element calculation stage, the unit R, L, C, Cx, and Lx of each two-pin net are calculated for the later peak noise estimation. In the peak noise stage, the peak RLC coupling noise is computed by a novel transmission line

based RLC model [11]. We propose an effective approach that picks the maximum lossy coupling noise from a few turning points instead of searching exhaustively. Finally, the overall average probabilistic noise is computed in the last stage of the probabilistic noise estimation, and our proposed algorithm is integrated.

3.2.2 Average Circuit Elements Calculation

Since we estimate the net topology without performing layer assignment, the unit average circuit element of each two-pin net n is first calculated for the later RLC peak noise estimation. Equation (3.1) gives the unit average circuit element of two-pin net n. Here, ckt_element(n)_i denotes the unit R, L, C, Cx, or Lx of two-pin net n in layer i, which can be computed by [19, 20]. Further, w_i is the weight of each layer. It models the probability that two-pin net n is routed on layer i, and its value can be obtained from a trial route or derived empirically. In general, we set w_i = 1/(total number of layers).

$$
\mathit{avg\_ckt\_element}(n) = \sum_{\forall\, layers} w_i \times \mathit{ckt\_element}(n)_i
\tag{3.1}
$$

Assume that the two wires of each coupled pair have equal length. The unit R, L, and Lx of each two-pin net can be computed by the following equations:

$$
\begin{aligned}
\frac{R}{l} &= \frac{3.3\ \mu\Omega\text{-cm}}{W T} \\
\frac{L_{ii}}{l} &= 0.002 \left[ \ln\!\left(\frac{2l}{W + T}\right) + 0.5 - 0.2235 \times \frac{W + T}{l} \right] \\
\frac{L_x}{l} &= 0.002 \left[ \ln\!\left(\frac{l}{d} + \sqrt{1 + \frac{l^2}{d^2}}\right) - \sqrt{1 + \frac{d^2}{l^2}} + \frac{d}{l} \right]
\end{aligned}
\tag{3.2}
$$

where l, W, and T denote the wirelength (µm), width (µm), and thickness (µm) of the two-pin net, respectively, and d is the distance (µm) between the victim and the aggressor. Further, the unit C and Cx between two nets in the middle layers can be written as:

$$
\begin{aligned}
\frac{C}{\epsilon_{ox}} ={}& \left(\frac{W}{H_1} + \frac{W}{H_2}\right)
+ 2.04 \left(\frac{T}{T + 4.5311 H_1}\right)^{0.071} \left(\frac{d}{d + 0.5355 H_1}\right)^{1.773} \\
&+ 2.04 \left(\frac{T}{T + 4.5311 H_2}\right)^{0.071} \left(\frac{d}{d + 0.5355 H_2}\right)^{1.773} \\
\frac{C_x}{\epsilon_{ox}} ={}& 1.4116\, \frac{T}{d} \exp\!\left(-\frac{2d}{d + 8.014 H_1} - \frac{2d}{d + 8.014 H_2}\right) \\
&+ 1.1852 \left(\frac{W}{W + 0.3078 d}\right)^{0.25724}
\cdot \left\{ \left(\frac{H_1}{H_1 + 8.961 d}\right)^{0.7571} + \left(\frac{H_2}{H_2 + 8.961 d}\right)^{0.7571} \right\} \\
&\times \exp\!\left(-\frac{2d}{d + 3(H_1 + H_2)}\right)
\end{aligned}
\tag{3.3}
$$

Fig. 3.4: Cross-view of parallel wires. (a) Cross-section diagram of parallel wires on the top layer. (b) Cross-section diagram of parallel wires in the middle layers.

where $\epsilon_{ox} = 3.9 \times 8.85 \times 10^{-14}$ F/cm, and the configurations of $H$, $H_1$, and $H_2$ are indicated in Fig. 3.4. Similarly, for the top layer shown in Fig. 3.4(a), the unit C and Cx can be calculated as follows:

$$
\begin{aligned}
\frac{C}{\epsilon_{ox}} ={}& \frac{W}{H} + 2.217 \left(\frac{d}{d + 0.702 H}\right)^{3.193}
+ 1.171 \left(\frac{d}{d + 1.51 H}\right)^{0.7642} \cdot \left(\frac{T}{T + 4.532 H}\right)^{0.1204} \\
\frac{C_x}{\epsilon_{ox}} ={}& 1.144\, \frac{T}{d} \left(\frac{H}{H + 2.059 d}\right)^{0.0944}
+ 0.7428 \left(\frac{W}{W + 1.592 d}\right)^{1.144} \\
&+ 1.158 \left(\frac{W}{W + 1.874 d}\right)^{0.1612} \cdot \left(\frac{H}{H + 0.9801 d}\right)^{1.179}
\end{aligned}
\tag{3.4}
$$

3.2.3 Peak Noise Estimation

After the avg_ckt_element of every two-pin net is determined, the peak RLC noise estimation begins. Assume that there are h nets and t available routing tracks in a grid. For the victim net A and aggressor net B shown in Fig. 3.5, the real routing topology and the length of a net are not known before routing, so we assume that each aggressor is as long as the victim, which is the worst case for a victim in its minimum routing region (called mesh(A)). The probabilistic noise between net A and net B in grid(m,n) can then be written as

$$
\mathit{Noise}_{AB}(m, n) = P^{L}_{AB}(m, n) \times \sum_{s=1}^{t-1} \mathit{peak\_noise}_{AB}(s) \times P_{AB}(s)
\tag{3.5}
$$

$$
P_{AB}(s) = \frac{C^{t-s}_{1}}{C^{t}_{2}}
\tag{3.6}
$$

where s is the spacing, in units of routing tracks, between net A and net B, and $P_{AB}(s)$ is the probability that net A and net B are separated by s units. For example, if there are 10 available routing tracks in a grid that net A and net B may pass through, then the probability that the two nets are separated by 1 routing track is 0.2 by Equation (3.6). Table 3.1 lists the separating probability of two nets for 1 to 9 tracks when there are 10 available routing tracks in a grid. If there are no shielding wires in grid(m,n), we have to consider the crosstalk noise for every spacing up to t − 1.

$P^{L}_{AB}(m, n)$ is the legal probability of net A and net B on grid(m,n). If the grid is not overflowed, the legal probability of net A and net B on grid(m,n) is equal to the probability of net A going through grid(m,n) times the probability of net B going through grid(m,n), that

Fig. 3.5: The probabilistic noise estimation between victim net A and aggressor net B. (mesh(A): minimum routing region of net A.)

Table 3.1: Separating probability of two nets in a grid with 10 available routing tracks.

Separating tracks    Separating probability
1                    0.2000
2                    0.1778
3                    0.1556
4                    0.1333
5                    0.1111
6                    0.0889
7                    0.0667
8                    0.0444
9                    0.0222
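The entries of Table 3.1 follow directly from Equation (3.6), and the weighted sum of Equation (3.5) is a one-line reduction on top of it. The sketch below reproduces the table for t = 10; the function names are hypothetical, and `peak_noise` stands for any caller-supplied function that returns the peak coupling noise at a given spacing.

```python
from math import comb

def separation_probability(s, t):
    """Equation (3.6): probability that two nets are s tracks apart among t tracks."""
    if not 1 <= s <= t - 1:
        raise ValueError("s must satisfy 1 <= s <= t - 1")
    return comb(t - s, 1) / comb(t, 2)

def grid_noise(p_legal, peak_noise, t):
    """Equation (3.5): legal probability times the spacing-weighted peak noise.

    peak_noise is a hypothetical callable mapping a spacing s to a noise value.
    """
    return p_legal * sum(peak_noise(s) * separation_probability(s, t)
                         for s in range(1, t))

# Separating probabilities for 10 routing tracks, rounded as in Table 3.1.
table = [round(separation_probability(s, 10), 4) for s in range(1, 10)]
```

Because the probabilities over s = 1, ..., t − 1 sum to one, a spacing-independent peak noise passes through `grid_noise` scaled only by the legal probability.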

is, $P_A(m, n) \times P_B(m, n)$. Otherwise, the legal probability of net A and net B must be re-computed by subtracting the redundant probabilistic usages caused by the overflow. However, finding the exact legal probability requires exhaustive enumeration, so we instead calculate an upper bound of the coexisting probability, as described in Section 3.3.

Another important term in Equation (3.5) is peak_noise_AB(s), which denotes the RLC peak noise between the victim (net A) and the aggressor (net B) when they are separated by s units of space. It can be calculated by Equation (2.22) in Section 2.4. Fig. 2.6 shows only the first reflection in a transmission line. In reality, the waves in a transmission line are reflected repeatedly. Fig. 3.6 illustrates the complete reflection behavior at the near-end and far-end of a victim wire.

Fig. 3.6: Reflection behavior in the near-end (source, Γ = Γ_S) and far-end (tail, Γ = +1) of the victim wire.

The reflected wave at the far-end of a transmission line is [24]

$$
V_{far\text{-}end} = V_{original} + V_{inc} + V_{refc} = V_{original} + V_{inc} + \Gamma V_{inc}
\tag{3.7}
$$

where $V_{original}$, $V_{inc}$, and $V_{refc}$ indicate the original wave, incident wave, and the reflective

wave of the terminations in a transmission line, respectively. Also, Γ denotes the reflection coefficient of the wire (−1 ≤ Γ ≤ 1). Table 3.2 lists the ramp waves that appear in Fig. 3.6.

Table 3.2: Ramp waves in the victim wire shown in Fig. 3.6.

Index    Corresponding ramp waves
V_i      A2(t) + A4(t)
V_1      A2(t − 2t_fe) + A4(t − 2t_fo)
V_2      A2(t − 4t_fe) + A4(t − 4t_fo)
V_O1     A2(t − t_fe) + A4(t − t_fo)
V_O2     A2(t − 3t_fe) + A4(t − 3t_fo)
V_O3     A2(t − 5t_fe) + A4(t − 5t_fo)

As Fig. 3.6 shows, the original incident wave $V_i$ reflects repeatedly between the near-end and far-end of the victim wire. Since the reflection coefficient of the victim's far-end is equal to +1, the ramp waves shown in Fig. 3.6 can be determined by using Equation (3.7). Because the even-mode and odd-mode reflected waves travel at different velocities, the reflection coefficient of the victim's near-end must be split into an even-mode and an odd-mode coefficient. Therefore, the lossless coupling noise at the victim's far-end can be written as

$$
V_{vic}(t) = 2 \sum_{k=0}^{\infty} \left\{ \Gamma_{even}^{k} A_2\big(t - (2k + 1) t_{fe}\big) + \Gamma_{odd}^{k} A_4\big(t - (2k + 1) t_{fo}\big) \right\}
\tag{3.8}
$$

$$
\begin{aligned}
\Gamma_{odd} &= \frac{R_v - Z_{0o2}}{R_v + Z_{0o2}} \\
\Gamma_{even} &= \frac{R_v - Z_{0e2}}{R_v + Z_{0e2}}
\end{aligned}
\tag{3.9}
$$

where $\Gamma_{odd}$ and $\Gamma_{even}$ are the odd-mode and even-mode reflection coefficients at the victim's near-end, respectively. To obtain the waveform at the victim's far-end, we sum the reflected waves at their turning points. Each ramp wave has two turning points, as illustrated in Fig. 3.7, where

$$
\begin{aligned}
t_0 &= t_{fo} \\
t_1 &= t_{fe} \\
t_2 &= t_{fo} + T_r \\
t_3 &= t_{fe} + T_r \\
&\ \ \vdots \\
t_{11} &= 5 t_{fe} + T_r
\end{aligned}
\tag{3.10}
$$

Fig. 3.7: Sum up turning points of each reflective wave to pick out the peak noise.

Here, $T_r$ denotes the transition time of the input signal. Note that the figure shows the worst case for finding the peak noise, in which the peak occurs at $t = t_{11}$. In general, the peak noise of a net occurs between $t_1$ and $t_{11}$. Interestingly, we find that the peak noise of every net occurs within 200 ps in our experimental environment (0.13 µm / 1.2 V, $T_r$ = 100 ps, and two metal layers for routing). In general, 11 turning points should be sampled to determine the peak value within the noise window, that is

$$
\begin{aligned}
t &= \{ n\, t_{fo} \mid n = 3, 5 \} \\
t &= \{ n\, t_{fe} \mid n = 1, 3, 5 \} \\
t &= \{ n\, t_{fo} + T_r \mid n = 1, 3, 5 \} \\
t &= \{ n\, t_{fe} + T_r \mid n = 1, 3, 5 \}
\end{aligned}
\tag{3.11}
$$

The sampling points in Equation (3.11) omit $t = t_{fo}$, since this is the starting point of $V_{vic}(t)$, where the voltage is always 0 V. However, sampling 11 points to pick out the peak noise at the victim's far-end is costly and slows down the placement. For the wirelengths in our experimental environment, we find that fewer than 11 points are enough to determine the peak noise. For example, if a net is longer than 4000 µm, sampling only the first 3 turning points ($t_{fo} + T_r$, $t_{fe}$, and $t_{fe} + T_r$) is enough to pick out the peak noise. If the experimental environment or the design technology changes, a table recording the relationship between the wirelength and the number of sampling points should be constructed first, so that correct peak values are obtained during the RLC estimation procedure. Exploiting this length property during placement greatly reduces the computational cost of the probabilistic noise estimation and speeds up the overall placement procedure.

In general, a net shorter than 4000 µm requires more turning points to pick out the peak noise. The shorter the wire, the smaller $t_{fe}$ and $t_{fo}$ are, and the more ramps occur within the noise window. Therefore, more turning points must be sampled for a shorter net than for a longer one.

Once the lossless peak noise of a net is determined by Equation (3.8), it can be converted to the lossy peak noise by Equation (2.22) in Section 2.4, and the total probabilistic noise of net A in grid(m,n) can be written as:

$$
\mathit{Noise}_{A}(m, n) = \sum_{K \in \Omega(m,n)} \mathit{Noise}_{AK}(m, n)
\tag{3.12}
$$

where Ω(m, n) is the set of nets that may pass through grid(m,n). Finally, the average probabilistic noise of net A in its mesh can be computed as follows:

$$
\mathit{Avg\_Noise}_{A} = \frac{\sum_{(m,n) \in mesh(A)} \mathit{Noise}_{A}(m, n)}{\#\ \text{grids in}\ mesh(A)}
\tag{3.13}
$$
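The turning-point sampling of Equations (3.8) and (3.11) can be sketched as follows. This is a hedged illustration rather than the thesis code: it assumes that $A_2(t)$ and $A_4(t)$ are saturated ramps with a common rise time, truncates the infinite sum after three reflections, and uses hypothetical function names and amplitudes. Because the truncated waveform is piecewise linear, its extreme values lie at ramp breakpoints, which is why sampling only the turning points is sufficient.

```python
def ramp(t, amplitude, t_rise):
    """Saturated ramp: 0 for t <= 0, linear up to `amplitude` at t = t_rise."""
    if t <= 0.0:
        return 0.0
    return amplitude * min(t / t_rise, 1.0)

def victim_far_end(t, a2, a4, t_fe, t_fo, t_rise, gamma_even, gamma_odd, n_terms=3):
    """Equation (3.8) truncated to the first n_terms reflections."""
    total = 0.0
    for k in range(n_terms):
        total += gamma_even ** k * ramp(t - (2 * k + 1) * t_fe, a2, t_rise)
        total += gamma_odd ** k * ramp(t - (2 * k + 1) * t_fo, a4, t_rise)
    return 2.0 * total

def peak_noise(a2, a4, t_fe, t_fo, t_rise, gamma_even, gamma_odd):
    """Return the largest-magnitude sample over the turning points.

    The point t = t_fo is sampled as well; it contributes 0 V and is harmless.
    """
    points = []
    for n in (1, 3, 5):
        points += [n * t_fe, n * t_fo, n * t_fe + t_rise, n * t_fo + t_rise]
    samples = [victim_far_end(t, a2, a4, t_fe, t_fo, t_rise, gamma_even, gamma_odd)
               for t in points]
    return max(samples, key=abs)

# Hypothetical case: matched near-end (no re-reflection), odd mode arrives first.
peak = peak_noise(a2=0.1, a4=-0.1, t_fe=20e-12, t_fo=10e-12,
                  t_rise=100e-12, gamma_even=0.0, gamma_odd=0.0)
```

A length-dependent sampling table, as suggested in the text, would simply shorten the `points` list for long nets.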

3.2.4 Overall Average Probabilistic Noise Estimation

Once the average probabilistic noise of each two-pin net is acquired, the next stage computes the overall average probabilistic noise of a placement by the following equation:

$$
\mathit{Overall\_Noise}_{avg} = \frac{\sum_{\forall\, \text{two-pin nets}} \mathit{Avg\_Noise}_{K}}{\#\ \text{two-pin nets}}
\tag{3.14}
$$

Table 3.3 shows the whole algorithm of the probabilistic RLC noise estimation. For each two-pin net i, its average circuit element is computed by the equations in Section 3.2.2, and the sum of the probabilistic noise of net i is initialized to zero. Next, for each two-pin net j (j ≠ i) that may pass through the grids of mesh(i), its peak noise interference with net i is computed by Equation (3.5). After all the probabilistic noise within mesh(i) is calculated, we estimate Avg_Noise_i by Equation (3.13). Finally, when the probabilistic noise of each two-pin net is obtained, Overall_Noise_avg is computed by Equation (3.14), and the probabilistic noise estimation stage is finished.

In practice, when a two-pin net is too short, that is, $l/w < 10$ [21], the Grover formulae used for our on-chip inductance estimation produce a large error. Therefore, we set the probabilistic RLC noise of the victim to zero if the length of the aggressor or the victim is shorter than 5 µm. This assumption is reasonable because a very short net suffers only tiny coupling noise and, as an aggressor, interferes with other nets only slightly. Further, since the probabilistic usages of a two-pin net in its routing grids are divided into horizontal and vertical usages, its coupling noise in the horizontal and vertical directions can each be computed by Equation (3.13).

Table 3.3: Algorithm of probabilistic RLC noise estimation.

Algorithm of Probabilistic RLC Noise Estimation
Input: Probabilistic usages of each two-pin net in its grids
Output: Average probabilistic RLC noise of each two-pin net
1  Begin
2    For each net i in the design
3      Compute avg_ckt_element(i)
4      Set Noise_i = 0
5      For each grid(m,n) within mesh(i)
6        For each net j within grid(m,n)  /* j ≠ i */
7          If length_i or length_j ≤ 5 µm Then Noise_ij(m,n) ← 0
8          Noise_i += Noise_ij(m,n)
9        EndFor
10     EndFor
11     Avg_Noise_i = Noise_i / (# of grids within mesh(i))
12   EndFor
13   Compute the Overall_Noise_avg
14 End

3.3 Upper Bound of the Coexisting Probability

Assume that there are h nets and t routing tracks in grid(m,n), and h > t. The arithmetic mean of a set of non-negative values is larger than or equal to its geometric mean. Using this property, the lower bound of the total probability of the illegal terms of net A and net B, $P^{I}_{AB}(m, n)$, which correspond to more than t nets coexisting in this grid, can be written as

$$
\sum_{j=1}^{x} P^{I}_{AB_j}(m, n) \;\ge\; x \cdot \sqrt[x]{\prod_{j=1}^{x} P^{I}_{AB_j}(m, n)}
\;=\; x \cdot \sqrt[x]{\prod_{\substack{K \in \Omega(m,n) \\ K \ne A, B}} \left(P_K^{\,r} \times \bar{P}_K^{\,s}\right)}
\tag{3.15}
$$

where Ω(m, n) is the set of nets that may pass through grid(m,n), $P^{I}_{AB_j}(m, n)$ represents the jth illegal term of net A and net B in grid(m,n), and x denotes the number of terms of $P^{I}_{AB}(m, n)$, which can be computed as follows:

$$
\begin{aligned}
\#\ \text{terms of}\ P^{L}_{AB}(m, n) &= \sum_{i=2}^{t-2} C^{t-2}_{i} \\
\#\ \text{terms of}\ P^{I}_{AB}(m, n) = x &= 2^{t-2} - \left(\#\ \text{terms of}\ P^{L}_{AB}(m, n)\right)
\end{aligned}
\tag{3.16}
$$

Similarly, # terms of $P^{L}_{AB}(m, n)$ denotes the number of terms of the legal probability of net A and net B in grid(m,n). Also, $P_K$ and $\bar{P}_K$ indicate the probabilities that net K does and does not go through this grid, and r and s denote the numbers of illegal terms in which net K does and does not go through grid(m,n), respectively. They can be computed by the following equation:

$$
\begin{aligned}
r &= \frac{1}{h - 2} \sum_{p=-1}^{h-t-2} \left[ (t + p)\, C^{h-2}_{t+p} \right] \\
s &= \sum_{p=-1}^{h-t-2} \left[ \left(1 - \frac{t + p}{h - 2}\right) C^{h-2}_{t+p} \right]
\end{aligned}
\tag{3.17}
$$

Consequently, the legal probability of net A and net B in grid(m,n), whose upper bound follows by applying Equation (3.15), can be written as:

$$
P^{L}_{AB}(m, n) = P_A(m, n) \times P_B(m, n) \times \left[ 1 - \sum_{j=1}^{x} P^{I}_{AB_j}(m, n) \right]
\tag{3.18}
$$

To further clarify the equations for the upper bound of the coexisting probability, we give an example. Assume that there are 3 available routing tracks in grid(m,n), and 5 nets (net A, B, C, D, and E) may go through the grid. Then the overflow is 5 − 3 = 2, and $P_{AB}(m, n)$ consists of

$$
\begin{aligned}
P_{AB}(m, n) ={}& P[AB\bar{C}\bar{D}\bar{E}] + P[AB\bar{C}\bar{D}E] + P[AB\bar{C}D\bar{E}] + P[ABC\bar{D}\bar{E}] \\
&+ P[ABCD\bar{E}] + P[AB\bar{C}DE] + P[ABC\bar{D}E] + P[ABCDE]
\end{aligned}
\tag{3.19}
$$

where the first four terms are the legal ones. The numbers of terms can be computed by using Equation (3.16):

$$
\begin{aligned}
\#\ \text{terms of}\ P^{L}_{AB}(m, n) &= C^{3}_{2} + C^{3}_{3} = 4 \\
\#\ \text{terms of}\ P^{I}_{AB}(m, n) &= 2^{\#tracks - 2} - \left(\#\ \text{terms of}\ P^{L}_{AB}(m, n)\right) = 2^{3} - 4 = 4
\end{aligned}
\tag{3.20}
$$

In Equation (3.19), the number of illegal terms containing C (or D, E) is 3, and the number of illegal terms containing $\bar{C}$ (or $\bar{D}$, $\bar{E}$) is 1. These counts correspond to the coefficients r and s in Equation (3.17).
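The bound can be checked numerically on the five-net example. The sketch below is an illustration under stated assumptions: the function names are hypothetical, r and s follow Equation (3.17), and the number of illegal terms x is counted directly as the number of ways that at least t − 1 of the other h − 2 nets are present, which is an assumed reading of Equation (3.16) that reproduces x = 4 in the example. An exhaustive enumerator is included only to confirm that the bound is not below the exact legal probability.

```python
from itertools import combinations
from math import comb, prod

def illegal_term_counts(h, t):
    """Return (x, r, s) for h nets and t tracks with h > t."""
    m = h - 2                              # nets other than A and B
    ks = range(t - 1, m + 1)               # k = t + p for p = -1 .. h - t - 2
    x = sum(comb(m, k) for k in ks)        # assumed direct count of illegal terms
    r = sum(k * comb(m, k) for k in ks) / m
    s = sum((1 - k / m) * comb(m, k) for k in ks)
    return x, r, s

def legal_probability_upper_bound(p_a, p_b, p_others, t):
    """Equation (3.18) with the sum replaced by the AM-GM bound of (3.15)."""
    h = len(p_others) + 2
    if h <= t:                             # grid cannot overflow
        return p_a * p_b
    x, r, s = illegal_term_counts(h, t)
    geo = prod(p ** r * (1.0 - p) ** s for p in p_others) ** (1.0 / x)
    return p_a * p_b * (1.0 - x * geo)

def legal_probability_exact(p_a, p_b, p_others, t):
    """Exhaustive reference: at most t - 2 of the other nets are present."""
    m = len(p_others)
    total = 0.0
    for k in range(0, min(m, t - 2) + 1):
        for present in combinations(range(m), k):
            total += prod(p_others[i] if i in present else 1.0 - p_others[i]
                          for i in range(m))
    return p_a * p_b * total

x, r, s = illegal_term_counts(h=5, t=3)    # the five-net, three-track example
bound = legal_probability_upper_bound(1.0, 1.0, [0.2, 0.5, 0.9], t=3)
exact = legal_probability_exact(1.0, 1.0, [0.2, 0.5, 0.9], t=3)
```

The bound is tight when all illegal terms are equal (for instance, when every other net has probability 0.5) and becomes looser as the probabilities spread out, which is the price paid for avoiding the exhaustive enumeration.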

Once the coefficients r and s are obtained, the illegal probability of net A and net B can be bounded by Equation (3.15):

$$
\begin{aligned}
\sum_{j=1}^{4} P^{I}_{AB_j}(m, n) &\ge 4 \cdot \sqrt[4]{\prod_{j=1}^{4} P^{I}_{AB_j}(m, n)}
= 4 \cdot \sqrt[4]{\prod_{\substack{K \in \Omega(m,n) \\ K \ne A, B}} \left(P_K^{\,r} \times \bar{P}_K^{\,s}\right)} \\
&= 4 \cdot \sqrt[4]{(P(C))^3 (P(D))^3 (P(E))^3 \cdot \bar{P}(C)\, \bar{P}(D)\, \bar{P}(E)}
\end{aligned}
\tag{3.21}
$$

Finally, we substitute the result of Equation (3.21) into Equation (3.18) to obtain the upper bound of the coexisting probability of net A and net B.

3.4 Partial Estimation

In general, the algorithm in Fig. 3.2 and Table 3.3 is sufficient to accomplish the crosstalk-driven placement. However, each B*-tree perturbation in the SA algorithm selects only 1 or 2 blocks to move, flip, or rotate, so the above procedure spends too much runtime analyzing many nets that have not changed. We should analyze only the nets whose blocks were moved by the last B*-tree perturbation, together with the nets whose blocks were shifted because their neighboring blocks moved. We call this procedure "partial analysis (estimation)", and the flows in Table 2.1 and Table 3.3 are called "complete analysis (estimation)". In the following sub-sections, we introduce the partial estimation of the congestion and of the probabilistic noise, respectively. Finally, the algorithm and several properties are given to show how these procedures work.

3.4.1 Partial Congestion Estimation

After a B*-tree is perturbed, the blocks selected in this perturbation are recorded. Since we must re-analyze not only the nets belonging to the selected blocks but also those belonging to blocks that are shifted because of their neighbors, we record the placement order of each block after a perturbation. Fig. 3.8 illustrates the procedure. Assume that two nodes, n4 and n5, are selected for perturbing, resulting in the B*-tree shown in Fig. 3.8(a).

Fig. 3.8: Re-check the blocks whose placement orders are after that of the selected nodes. (a) Nodes n4 and n5 are selected during a B*-tree perturbation. (b) The corresponding placement order.

We then record the corresponding placement order shown in Fig. 3.8(b). If the placement order of a block is after b3, its new location is compared with its previous one. If the block has moved, the probabilistic noise and usages of the nets that belong to it have to be re-analyzed. The blocks whose placement orders are before b4 are placed earlier, so their new locations are the same as their previous ones. Recording the placement order to determine which blocks must be re-analyzed is more efficient than exhaustively comparing locations, and saves considerable runtime in the partial estimation procedure.

After determining which blocks have shifted, the nets belonging to these blocks are re-analyzed. Fig. 3.9 illustrates the procedure of the partial congestion estimation. Assume that the original net A (called old_net A) lies in old_mesh(A), the yellow region. If it is moved to the top-left corner of the chip after a perturbation, we compute its new usage, add it to every grid within new_mesh(A), and update the congestion. Also, the old existence of old_net A within its original mesh has

to be erased and updated. Furthermore, if there are mark-grids shared by old_mesh(A) and new_mesh(A) (mark_grid(A) ∈ {old_mesh(A) ∩ new_mesh(A)}), we only need to re-compute the congestion within these grids, without updating the existence of net A.

Fig. 3.9: Illustration of the partial congestion estimation procedure. (Green region: new_mesh(A), where the existence of new_net A must be added. Yellow region: old_mesh(A), where the old existence of old_net A must be erased. Blue region: mark-grid.)

Table 3.4 shows the algorithm of the partial congestion estimation. First, the approach illustrated in Fig. 3.8 is used to determine which nets should be re-analyzed; we call them the changed-nets. For each changed-net, in addition to performing the congestion estimation as before, we update its existence and probabilistic usages within each grid of its old mesh and its new mesh. Finally, we compute the congestion of each grid and finish the partial congestion estimation.

Table 3.4: Algorithm of the partial congestion estimation.

Algorithm of Partial Congestion Estimation
Input: Netlist of the placement
Output: Probabilistic usages of each two-pin net; Congestion of each grid
1  Begin
2    For each changed-net
3      MST(net)
4      For each segment of the MST
5        Determine the size of new mesh
6        Erase the old usage within its old mesh
7        Erase the old existence within its old mesh excluding mark grids
8        Compute the horizontal and vertical usages within its new mesh
9        Update the new existence within its new mesh
10     EndFor
11   EndFor
12   For each grid in the design
13     Compute the congestion of the grid
14   EndFor
15 End

3.4.2 Partial Probabilistic Noise Estimation

The concept of the partial noise estimation is similar to that of the partial congestion estimation, but more complicated. We use the configuration shown in Fig. 3.9 again to explain the procedure. When net A is moved to new_mesh(A), the green region, the probabilistic noise of the nets related to net A within old_mesh(A) must be updated first. That is, if a net (called net B) in old_mesh(A) is not shifted, the interference due to net A must be subtracted from its probabilistic noise. Likewise, the probabilistic noise of net A is updated by subtracting the interference resulting from net B. After the update of old_mesh(A) is complete, we carry out the noise update in new_mesh(A). Similar to the complete noise estimation in Table 3.3, if a net (called net C) lies within new_mesh(A) and does not exist in old_mesh(A), the interference due to net A is added to it. Similarly, the interference resulting from net C is added to net A. In brief, the properties of the noise update can be summarized as follows:
• A net that is a changed-net (called net A): subtract the interference due to the related nets in old_mesh(A), and re-compute the new probabilistic noise due to the nets in new_mesh(A).
• A net that exists in old_mesh(A) "originally" (that is, it is not a changed-net) and does not belong to new_mesh(A): only subtract the interference due to net A.
• A net whose minimum routing region covers both old_mesh(A) and new_mesh(A): subtract the old interference due to net A, then add the new interference resulting from net A.

Modifying steps 2 to 4 in Table 3.3 according to the above properties determines how the noise of each net is updated, and accomplishes the partial noise estimation. Because the coupling noise resulting from the mutual inductance is expensive to compute, performing the partial noise estimation is necessary and considerably reduces the runtime of the placement.

3.5 Algorithm Flow of Crosstalk-Driven Placement

In this section, the procedures are integrated and the overall algorithm of our crosstalk-driven placement is given. Table 3.5 shows the full algorithm. First, we set perturb_flag to false and let all the nets belong to the changed-net set, which means that one complete estimation of the congestion and of the probabilistic RLC noise is performed, as stated in Table 2.1 and Table 3.3, respectively. After the complete estimation of each two-pin net, the quality of the placement is judged in line 8. If it meets the noise constraint or the SA temperature is cool enough, the placement procedure exits. Otherwise, a B*-tree perturbation is performed to seek a better placement, and perturb_flag is set to true. The members of the changed-net set are also updated in the manner described in Section 3.4.1.

When the perturbation is complete, we execute the partial congestion estimation and

(50) Algorithm of Crosstalk-Driven Placement Input: Netlist of the design Output: Area, total estimated wirelength, and probabilistic RLC noise of the placement 1 Begin 2 Set perturb flag ← false 3 Set {changed-net} ← {all the nets} 4 For each net of the {changed-net} 5 If perturb flag = false Perform the complete congestion estimation ; Perform the complete probabilistic RLC noise estimation ; 6 Else Perform the partial congestion estimation ; Perform the partial probabilistic RLC noise estimation ; 7 EndFor 8 If meet the noise constraint or SA cooling enough Exit placement ; 9 Else Perturb B*-tree ; Set perturb flag ← true ; Update the {changed-net} ; Goto line 4 ; 10 End. Table 3.5: Algorithm of the proposed crosstalk-driven placement. partial probabilistic RLC noise estimation for each changed-net, which are illustrated in Table 3.4 and Chapter 3.4.2, respectively. The above procedures are iteratively proceed until one of the conditions is satisfying, then exit the placement.. 42.
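The bookkeeping behind the partial noise estimation of Section 3.4.2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: each net is reduced to a single axis-aligned rectangular mesh (x0, y0, x1, y1), and the hypothetical function interference() uses the overlap area of two meshes as a stand-in for the pairwise probabilistic RLC noise term. The names complete_noise and partial_noise_update are likewise hypothetical. The sketch shows that, when one changed-net moves, subtracting its old pairwise terms and adding its new ones yields the same per-net noise as a complete re-estimation.

```python
from itertools import combinations


def overlap_area(m1, m2):
    """Area shared by two axis-aligned meshes given as (x0, y0, x1, y1)."""
    w = min(m1[2], m2[2]) - max(m1[0], m2[0])
    h = min(m1[3], m2[3]) - max(m1[1], m2[1])
    return max(w, 0) * max(h, 0)


def interference(m1, m2):
    # ASSUMPTION: placeholder for the pairwise probabilistic RLC noise term.
    # The real term depends on coupling capacitance, mutual inductance and
    # pin resistance; overlap area is used here only to keep the sketch small.
    return overlap_area(m1, m2)


def complete_noise(meshes):
    """Complete estimation: accumulate the pairwise term over all net pairs."""
    noise = {net: 0 for net in meshes}
    for a, b in combinations(meshes, 2):
        term = interference(meshes[a], meshes[b])
        noise[a] += term
        noise[b] += term
    return noise


def partial_noise_update(meshes, noise, net, new_mesh):
    """Partial estimation for one changed-net that moves to new_mesh.

    Nets coupled to the changed-net in its old mesh lose the old term,
    nets coupled to it in its new mesh gain the new term, and the
    changed-net itself is adjusted symmetrically. Both dicts are
    modified in place.
    """
    old_mesh = meshes[net]
    for other, mesh in meshes.items():
        if other == net:
            continue
        old_term = interference(old_mesh, mesh)
        new_term = interference(new_mesh, mesh)
        if old_term:  # 'other' was related to the net within old mesh
            noise[other] -= old_term
            noise[net] -= old_term
        if new_term:  # 'other' is related to the net within new mesh
            noise[other] += new_term
            noise[net] += new_term
    meshes[net] = new_mesh


if __name__ == "__main__":
    meshes = {"A": (0, 0, 4, 4), "B": (2, 2, 6, 6),
              "C": (8, 0, 12, 4), "D": (3, 0, 9, 3)}
    noise = complete_noise(meshes)
    partial_noise_update(meshes, noise, "A", (7, 1, 11, 5))
    # The incremental result matches a complete re-estimation.
    print(noise == complete_noise(meshes))
```

For brevity the sketch scans every net when looking for neighbours of the changed-net. A practical placer would instead query only the nets recorded in the grids of the old and new meshes, which is where the runtime saving of the partial estimation comes from. When several nets change in one perturbation, the update can be applied to them one after another.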

Chapter 4
Experimental Results

To check the validity of the proposed placement, we test our method on the MCNC benchmarks and two additional cases, ckt529 and ckt1681. The number of cells and nets of each test case is shown in Table 4.1. The proposed placement is implemented in C++ and run on a Pentium IV 3.2 GHz machine with 2 GB of memory.

Benchmark   apte   hp   xerox   ami33   ami49   ckt529   ckt1681
# cells        9   11      10      33      49      529      1681
# nets        97   83     203     123     408      613      1991
Table 4.1: Number of cells and nets of MCNC and our benchmarks.

We compare the results of the area-driven, congestion-driven, RC-driven, and RLC-driven placement. The area-driven placement only minimizes the placement area. The congestion-driven placement minimizes the area, total wirelength, number of overflowing grids, and overall routing density. The other two minimize the area, total wirelength, and overall routing density, penalize overflow to prevent congestion, and then use the proposed algorithm to minimize the crosstalk noise. The difference between the RC-driven and RLC-driven placement is that the former only takes R, L, C, and Cx of the wires into account, whereas the latter additionally considers the mutual inductance. After each placement is complete, its overall average probabilistic RLC noise is verified by Eq. (3.14). Here, we calculate the true legal coexisting probability of two nets, instead of its upper bound, to verify the real probabilistic noise of a design.

All of the test cases use a 0.13 µm / 1.2 V technology and two metal layers for congestion estimation. Furthermore, we set the input signal rising time to Tr = 100 ps, the resistance of each pin ranges from about 30 to 120 Ω in proportion to its cell area, and a shielding wire is inserted after every 10 wires. Hence, the crosstalk noise is considered over a range of 10 units of spacing. The experimental results are shown in Tables 4.6 to 4.8, where WL denotes the total wirelength estimated by the half-perimeter wirelength technique, and max_H and max_V denote the maximum estimated horizontal and vertical routing density over all global routing grids, respectively. In addition, P_RC, P_RLC, and Peak_RLC noise denote the overall average probabilistic RC noise, the overall average probabilistic RLC noise, and the peak probabilistic RLC noise, respectively. We set up the environment for each placer as follows in order to compare their experimental results; the cost coefficients of each cost function sum to 1.

• Area-driven placement: The placement flow of the area-driven placement is shown in Table 4.2. As the results in Table 4.6 show, it only minimizes the total area (cost coefficient α = 1), so it obtains the minimum placement area but sacrifices total wirelength, routability, and crosstalk immunity. It may even be unroutable for several benchmarks if the maximum estimated density is larger than 1.

Placement flow of area-driven placement
1  Parsing the input netlist
2  Minimize the total area  /* cost = α(Area) */
3  Probabilistic RLC noise verification
4  Maximum routing density analysis
Table 4.2: Placement flow of the area-driven placement.

• Congestion-driven placement: The placement flow of the congestion-driven placement is shown in Table 4.3. The congestion-driven placement simultaneously minimizes the total area, the wirelength, the average congestion, and the number of overflowing grids, with α + β + γ + δ = 1. Intuitively, one might expect that minimizing the congestion is equivalent to mitigating the coupling effects between interconnects, so that the overall crosstalk noise is kept under control. However, the experimental results of the congestion-driven placement in Table 4.7 contradict this expectation. The reason is that the coupling noise is dominated not only by Cx and Lx but also by the pin resistances. If a net lies in a region of higher coupling but has a stronger driver, it has stronger noise immunity and suffers less noise. This is why minimizing the congestion alone cannot achieve the best noise immunity.

Placement flow of congestion-driven placement
1  Parsing the input netlist
2  Minimize the cost function  /* cost = α(Area) + β(WL) + γ(avg_congestion) + δ(number of overflowing grids) */
3  Probabilistic RLC noise verification
4  Maximum routing density analysis
Table 4.3: Placement flow of the congestion-driven placement.

• RC-driven placement: The placement flow of the RC-driven placement is shown in Table 4.4. The RC-driven placement simultaneously minimizes the total area, the wirelength, the average congestion, and the probabilistic RC noise (P_RC), with α + β + γ + δ = 1. After the placement is complete, we verify its probabilistic RLC noise to check the signal integrity. The results in Table 4.8 reveal that, on average, the RC-driven placement has smaller area and higher congestion but worse noise immunity than the RLC-driven placement. We speculate that this is because, without considering the mutual inductance, the probabilistic RC noise may still appear small in highly congested regions. Moreover, comparing P_RC and P_RLC in Table 4.8 shows that the RC model underestimates the crosstalk noise by about 3X relative to the RLC model in our test cases.

Placement flow of RC-driven placement
1  Parsing the input netlist
2  Minimize the cost function  /* cost = α(Area) + β(WL) + γ(avg_congestion) + δ(P_RC) */
3  Probabilistic RLC noise verification
4  Maximum routing density analysis
Table 4.4: Placement flow of the RC-driven placement.

Placement flow of RLC-driven placement
1  Parsing the input netlist
2  Minimize the cost function  /* cost = α(Area) + β(WL) + γ(avg_congestion) + δ(P_RLC) */
3  Maximum routing density analysis
Table 4.5: Placement flow of the RLC-driven placement.

• RLC-driven placement: The placement flow of the RLC-driven placement is shown in Table 4.5. It is similar to the RC-driven placement, the difference being that the RLC-driven placement considers the probabilistic RLC noise (P_RLC) instead of P_RC. The RLC-driven placement achieves the best total wirelength and crosstalk noise among all the placers. Because it considers the effect of the mutual inductance, its placement area is slightly larger than that of the others. From the experimental results shown in Table 4.9, the RLC-driven placement improves the probabilistic RLC noise on average by 8.9% and 15.9% relative to that of the RC-