Chapter 3 EXTENSIONS
B. Global Routing
To incorporate our Steiner tree construction into global routing, we must consider the capacity of each edge on the global routing graph. Without loss of generality, assume the net ordering is given. On the 3D escape graph, an edge whose capacity is full can be set as forbidden; otherwise, the edge can still be used. After the RSMT of a net is constructed, the remaining capacity of each routed edge is reduced accordingly. Moreover, because the grids on upper metal layers are larger than those on lower layers, we may slightly shift the lines of the 3D escape graph to align them with their nearest grids.
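The capacity bookkeeping described above can be sketched as follows. This is a minimal illustration, not the implementation used in this thesis: the class name `CapacityTracker`, its methods, and the representation of graph vertices as `(x, y, layer)` tuples are all assumptions made for the example, and an edge with no recorded capacity is treated as forbidden.

```python
class CapacityTracker:
    """Hypothetical bookkeeping of remaining edge capacities on a routing graph."""

    def __init__(self, capacity):
        # capacity: dict mapping an undirected edge (u, v) to its initial capacity.
        self.remaining = {self._key(u, v): c for (u, v), c in capacity.items()}

    @staticmethod
    def _key(u, v):
        # Store each undirected edge under a canonical (sorted) key.
        return tuple(sorted((u, v)))

    def is_forbidden(self, u, v):
        # An edge is forbidden once its capacity is used up.
        # Assumption: edges that were never registered are forbidden too.
        return self.remaining.get(self._key(u, v), 0) <= 0

    def commit(self, tree_edges):
        # After the tree of one net is constructed, reduce the capacity
        # of every edge that the tree occupies.
        keys = [self._key(u, v) for u, v in tree_edges]
        for k in keys:
            if self.remaining.get(k, 0) <= 0:
                raise ValueError(f"edge {k} has no remaining capacity")
        for k in keys:
            self.remaining[k] -= 1


# Usage: vertices are (x, y, layer) tuples; nets are processed in the given order.
tracker = CapacityTracker({
    ((0, 0, 1), (1, 0, 1)): 1,
    ((1, 0, 1), (1, 1, 1)): 2,
})
tracker.commit([((0, 0, 1), (1, 0, 1))])  # route the first net over one edge
```

A tree construction for the next net would query `is_forbidden` for each candidate edge and skip those that return `True`.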
Chapter 4
EXPERIMENTAL RESULTS
We implemented our algorithm in C++ and executed the program on a PC with an Intel Pentium 4 3.0 GHz CPU and 1 GB of memory under Windows XP.
Our results show that our algorithm outperforms state-of-the-art works on SL-OARSMT, ML-OARSMT, and OAPDST. Our runtimes are also stable, increasing little from SL-OARSMT to ML-OARSMT and OAPDST. In addition, a comparison between the DT without and with obstacles is provided, as are the results of timing-driven RSMT.
TABLE II
SL-OARSMT: THE COMPARISONS ON THE TOTAL WIRELENGTH BETWEEN PREVIOUS WORK [6-10] AND OURS
¹HPBB: The half-perimeter of the bounding box of all pin-vertices (which is a lower bound of the total wirelength); “-” refers to “not available.”
²Ours_SL: Our algorithm for SL-OARSMT.
³Nref: All steps of our algorithm are applied, but 3D U-shaped pattern refinement is turned off.
⁴Full: All steps of our algorithm are applied.
⁵Mst: Only step 1 (DT) and step 2 (MST) of our algorithm are applied.
⁶,⁷Avg. (%): Average improvements are computed by averaging (X-G)/X and (X-G)/(X-Y), respectively, over all cases, for X = A, B, C, D, E, F.
A. SL-OARSMT
For SL-OARSMT, a total of 14 benchmark circuits were provided by [8]; the first 3 are from industry, and the rest are from [6]. We compared our algorithm with those presented in [6], [7], [8], [9], [10]. The results of [6] and [8] are quoted from their papers; those of [7] are quoted from [8]; those of [9] and [10] were obtained on our platform using their binaries. (In addition, the parameter α was set within [0.50, 1.30] depending on the congestion.) As listed in TABLE II, when measured as the difference from the half-perimeter of the bounding box of all pins (which is a lower bound of the optimal solution), our algorithm achieved average wirelength improvements of 5.03% up to 26.88% over them. Moreover, we had the best results in 12 out of 14 cases. Fig. 13 shows the resulting SL-OARSMTs of sl-rc6 and sl-rc9. Even without refinement, we still slightly outperform [8] and [9] on average in total wirelength. The novel 3D U-shaped pattern refinement also worked well in planar cases and contributed a 2.76% reduction in wirelength. Because our method mainly targets the multi-layer case, the runtime overhead for the single-layer case is reasonable.
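The two averages used in TABLE II can be illustrated with a short sketch. It assumes that Y in the footnote denotes the HPBB lower bound, so that (X-G)/(X-Y) measures the improvement relative to the gap above that bound; the function names and the sample numbers are hypothetical and are not taken from the tables.

```python
def hpbb(pins):
    """Half-perimeter of the bounding box of a list of (x, y) pin positions."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def average_improvements(cases):
    """Return the two average improvements, in percent.

    cases: list of (x, g, y) tuples, where x is the wirelength of the
    compared method, g is our wirelength, and y is the HPBB lower bound
    (assumed meaning of Y). Requires x > y in every case.
    """
    plain = [(x - g) / x for x, g, _ in cases]          # (X-G)/X
    over_gap = [(x - g) / (x - y) for x, g, y in cases]  # (X-G)/(X-Y)
    n = len(cases)
    return 100.0 * sum(plain) / n, 100.0 * sum(over_gap) / n


# Usage with made-up numbers (not results from this thesis).
bound = hpbb([(0, 0), (3, 4), (1, 2)])  # width 3 + height 4
avg_plain, avg_gap = average_improvements([(110, 100, 90), (200, 190, 150)])
```

Because the second measure divides by the distance to the lower bound rather than by the full wirelength, it is always at least as large as the first whenever the bound is positive.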
B. ML-OARSMT
For ML-OARSMT, a total of 10 test cases were provided by [11]. ml-ind4 and ml-ind5 simulate the environment for single-layer routing, where all pins and obstacles are located in one layer, and the upper and lower adjacent layers are entirely occupied by two other large obstacles. Fig. 14 displays the ML-OARSMT of ml-ind2 generated by our algorithm with Cv = 3.
We compared our algorithm with [11]. (In addition, the parameter α was set within [0.70, 1.15].) As listed in TABLE III (IV), with Cv = 3 (5), the average improvements in the number of vias and in the total cost are 7.69% (4.76%) and 2.77% (2.74%), respectively.
Our algorithm has smaller total costs in 9 out of 10 cases. In addition, for each case, our algorithm always generated a smaller total cost with Cv = 3 than with Cv = 5; it can be seen that our algorithm is indeed stable.
TABLE III
ML-OARSMT: THE COMPARISONS ON THE NUMBER OF VIAS, THE TOTAL COSTS, AND CPU TIMES BETWEEN [11] AND OURS UNDER CV = 3
¹The runtimes of [11] are quoted from the paper and were generated on a 2.8 GHz AMD-64 machine with 8 GB memory under Ubuntu 6.06. They are listed for reference only because the machine is different.
²[11]_ref: 3D U-shaped pattern refinement is directly applied to the resulting ML-OARSMT of [11]. The runtimes of [11]_ref count only the refinement and are measured on our platform.
³Ours_ML: Our algorithm for ML-OARSMT.
³Imp. (%): Average improvement is computed by averaging (A-X)/A over all cases, for X = B, C, or D.
TABLE IV
ML-OARSMT: THE COMPARISONS ON THE NUMBER OF VIAS, THE TOTAL COSTS, AND CPU TIMES BETWEEN [11] AND OURS UNDER CV = 5
¹The runtimes of [11] are quoted from the paper and were generated on a 2.8 GHz AMD-64 machine with 8 GB memory under Ubuntu 6.06. They are listed for reference only because the machine is different.
²[11]_ref: 3D U-shaped pattern refinement is directly applied to the resulting ML-OARSMT of [11]. The runtimes of [11]_ref count only the refinement and are measured on our platform.
³Ours_ML: Our algorithm for ML-OARSMT.
³Imp. (%): Average improvement is computed by averaging (A-X)/A over all cases, for X = B, C, or D.
C. OAPDST
For OAPDST, a total of 10 test cases were used. Seven of them are exactly the same as those used for ML-OARSMT. We did not use ml-ind3 because it is invalid under the PD constraints. pd-ind4, pd-ind5a, and pd-ind5b were modified from ml-ind4 and ml-ind5. For pd-ind4 and pd-ind5a, we inserted one empty layer right above the working layer. For pd-ind5b, we further duplicated the obstacles in the working layer onto the inserted layer. In addition, the routing cost UCi was set to 1 for all i. By doing so, we can see how much worse an OAPDST can be with respect to its ML-OARSMT counterpart. (In addition, the parameter α was set within [0.70, 1.00].) Fig. 15 displays the OAPDST of ml-ind2 generated by our algorithm with Cv = 3.
We compared our algorithm with [13]. Because we could not obtain the test cases and the program of [13], we implemented their algorithm and executed it on the same machine described above.
As listed in TABLE V, with Cv = 3, the average degradation of the total costs from ML-OARSMT to OAPDST is 6.47%, but the average speedup of CPU times is 60.18%. The average improvement of the total costs over [13] is 3.20%, while the CPU times are almost the same. Our algorithm has smaller total costs in 9 out of 10 cases.
TABLE VI compares the impacts of the obstacle-weighted MST and 3D U-shaped pattern refinement on our algorithm. Without the guidance of the MST, the total costs degrade by 8.76% on average, and the CPU times surprisingly become much worse (38.88% slower). Hence, steps 1 and 2 are necessary; in fact, they are both efficient and effective. On the other hand, although 3D U-shaped pattern refinement has little influence on our results, it does improve the total costs of [13] by 2.66% on average. The refined results of [13] are still slightly worse than ours with refinement turned off. Although not presented here, we have similar results for Cv = 5 and UCi ≠ 1.
TABLE V
OAPDST: THE COMPARISONS ON THE NUMBER OF VIAS, THE TOTAL COST, AND CPU TIMES BETWEEN [13] AND OURS UNDER CV = 3, UCI = 1, 1≦I≦NL
¹Ours_ML: Our algorithm is applied without the PD constraints; it can be viewed as the lower bound of the total cost for OAPDST.
²Ours_PD: All steps of our algorithm for OAPDST are applied.
³Imp. (%): Average improvement is computed by averaging (X-B)/X over all cases, where X = A, C.
TABLE VI
OAPDST: THE COMPARISONS ON THE IMPACTS OF OUR ALGORITHM ON THE TOTAL COST AND CPU TIMES UNDER CV = 3, UCI = 1, 1≦I≦NL
¹Nmst: Only step 3 of our algorithm is applied, i.e., the tree is directly constructed from the 3D extended escape graph.
²Nref: All steps of our algorithm are applied, but 3D U-shaped pattern refinement is turned off.
³[13]_ref: 3D U-shaped pattern refinement is applied to [13].
⁴Imp. (%): Average improvement is computed by averaging (X-B)/X over all cases, where B is Ours_PD in TABLE V and X = D, E, F.
Fig. 13. (a) The SL-OARSMT of sl-rc6. (b) The SL-OARSMT of sl-rc9.
Fig. 14. The ML-OARSMT of ml-ind2 under Cv = 3. (a) The DT without illegal edges. (b) The MST. (c)-(g) Layers 2-6, respectively. (h) All pin-vertices are projected onto a pseudo plane, without showing the obstacles.
Fig. 15. The OAPDST of ml-ind2 under Cv = 3. (a) The DT with illegal edges. (b) The MST. (c)-(g) Layers 2-6, respectively. The odd (even) layers allow vertical (horizontal) edges. Some line segments are at obstacle boundaries; they are feasible according to the problem formulation. (h) All pin-vertices are projected onto a pseudo plane, without showing the obstacles.
Chapter 5 CONCLUSION
In this thesis, we solved ML-OARSMT and OAPDST with the same strategy. We also showed that our method can be extended to construct timing-driven RSMTs. Previous work tackles one configuration at a time, whereas our algorithm can easily handle various configurations. Experimental results showed that our algorithm outperformed the state-of-the-art works. Future work includes extensions to clock trees and manufacturability-aware trees.
REFERENCES
[1] The International Technology Roadmap for Semiconductors (ITRS), 2007. Available: http://www.itrs.net/
[2] M. R. Garey and D. S. Johnson, “The rectilinear Steiner tree problem is NP-complete,” SIAM J. Appl. Math., vol. 32, no. 4, pp. 826-834, 1977.
[3] J. L. Ganley and J. P. Cohoon, “Routing a multi-terminal critical net: Steiner tree construction in the presence of obstacles,” in Proc. IEEE Int. Symp. on Circuits and Systems (ISCAS’94), vol. 1, May 1994, pp. 113-116.
[4] M. de Berg, O. Cheong, M. van Kreveld, and M. Overmars, Computational Geometry: Algorithms and Applications, 3rd ed., Springer-Verlag, 2008.
[5] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, 2nd ed., MIT Press, 2001.
[6] Z. Feng, Y. Hu, T. Jing, X. Hong, X. Hu, and G. Yan, “An O(nlogn) algorithm for obstacle-avoiding routing tree construction in the lambda geometry plane,” in Proc. ACM Int. Symp. on Physical Design (ISPD’06), Apr. 2006, pp. 48-55.
[7] Z. Shen, C. C. N. Chu, and Y.-M. Li, “Efficient rectilinear Steiner tree construction with rectilinear blockages,” in Proc. IEEE Int. Conf. on Computer Design (ICCD’05), Oct. 2005, pp. 38-44.
[8] P.-C. Wu, J.-R. Gao, and T.-C. Wang, “A fast and stable algorithm for obstacle-avoiding rectilinear Steiner minimal tree construction,” in Proc. ACM/IEEE Asia and South Pacific Design Automation Conf. (ASP-DAC’07), Jan. 2007, pp. 262-267.
[9] C.-W. Lin, S.-Y. Chen, C.-F. Li, Y.-W. Chang, and C.-L. Yang, “Obstacle-avoiding rectilinear Steiner tree construction based on spanning graphs,” IEEE Trans. Computer-Aided Design, vol. 27, no. 4, pp. 643-653, Apr. 2008. Also see Proc. ACM Int. Symp. on Physical Design (ISPD’07), pp. 127-134.
[10] J. Long, H. Zhou, and S. O. Memik, “An O(nlogn) edge-based algorithm for obstacle-avoiding rectilinear Steiner tree construction,” in Proc. ACM Int. Symp. on Physical Design (ISPD’08), Apr. 2008, pp. 126-133.
[11] C.-W. Lin, S.-L. Huang, K.-C. Hsu, M.-X. Lee, and Y.-W. Chang, “Multilayer obstacle-avoiding rectilinear Steiner tree construction based on spanning graphs,” IEEE Trans. Computer-Aided Design, vol. 27, no. 11, pp. 2007-2016, Nov. 2008. Also see Proc. IEEE/ACM Int. Conf. on Computer-Aided Design (ICCAD’07), pp. 380-385.
[12] M. C. Yildiz and P. H. Madden, “Preferred direction Steiner trees,” IEEE Trans. Computer-Aided Design, vol. 21, no. 11, pp. 1368-1372, Nov. 2002.
[13] C.-H. Liu, Y.-H. Chou, S.-Y. Yuan, and S.-Y. Kuo, “Efficient multilayer routing based on obstacle-avoiding preferred direction Steiner tree,” in Proc. ACM Int. Symp. on Physical Design (ISPD’08), Apr. 2008, pp. 118-125.
[14] I. H.-R. Jiang, S.-W. Lin, and Y.-T. Yu, “Unification of obstacle-avoiding rectilinear Steiner tree construction,” in Proc. IEEE Int. SOC Conf. (SOCC’08), Sep. 2008.
[15] I. H.-R. Jiang and Y.-T. Yu, “Configurable rectilinear Steiner tree construction for SoC and nano technologies,” in Proc. IEEE Int. Conf. on Computer Design (ICCD’08), Oct. 2008, pp. 34-39.
APPENDIX
Timing-Driven Steiner Trees
In addition to preferred directions, our method can also be applied to other types of Steiner trees. For example, a timing-driven RSMT aims to minimize the path length from the designated source pin to each of the other pins. This can be done by constructing a shortest-path tree instead of an MST at step 2. The shortest-path tree is constructed by Dijkstra’s shortest path algorithm. Moreover, if step 3 remains the same, we may obtain a compromise between the path length and the total cost.
For timing-driven RSMT, the source pin is randomly assigned in our experiments.
TABLE VII lists the results for timing-driven RSMT. Fig. 16 displays the timing-driven OAPDST of ml-ind2 generated by our algorithm with Cv = 5.
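The replacement of the MST by a shortest-path tree at step 2 can be illustrated with a generic sketch of Dijkstra’s algorithm. The adjacency-list format, the function name `shortest_path_tree`, and the toy graph are assumptions made for this example; in our flow the input would be the graph produced by step 1 with its own edge weights.

```python
import heapq
from itertools import count


def shortest_path_tree(graph, source):
    """Dijkstra's algorithm on a graph with non-negative edge weights.

    graph: dict mapping a vertex to a list of (neighbor, weight) pairs.
    Returns (dist, parent); the parent pointers form the shortest-path tree
    rooted at the source.
    """
    dist = {source: 0}
    parent = {source: None}
    tie = count()  # tie-breaker so that vertices themselves are never compared
    heap = [(0, next(tie), source)]
    while heap:
        d, _, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry; a shorter path to u was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, next(tie), v))
    return dist, parent


# Usage on a small undirected toy graph; "s" plays the role of the source pin.
toy = {
    "s": [("a", 2), ("b", 5)],
    "a": [("s", 2), ("b", 1)],
    "b": [("s", 5), ("a", 1), ("c", 2)],
    "c": [("b", 2)],
}
dist, parent = shortest_path_tree(toy, "s")
```

In this toy graph the tree reaches `b` through `a` rather than over the direct edge of weight 5, which is the behavior that favors short source-to-sink paths over a small total tree weight.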
TABLE VII
TIMING-DRIVEN RSMT
¹Original: The results of our algorithms to minimize the total cost are quoted from TABLE II, III, and V.
²Timing-Driven: To minimize the delay from the source to each sink, the minimum spanning tree is replaced with a shortest-path tree at step 2. The rest of our algorithm is unchanged.
Fig. 16. The timing-driven OAPDST of ml-ind2 under Cv = 5. The DT is the same as in Fig. 15(a). (a) The SPT. (b)-(f) Layers 2-6, respectively. The even (odd) layers allow vertical (horizontal) edges. Some line segments are at obstacle boundaries; they are feasible according to the problem formulation. (g) All pin-vertices are projected onto a pseudo plane, without showing the obstacles.