Chapter 2 PROBLEM FORMULATION AND ALGORITHM
D. Time Complexity Analysis
Let n = m + 4k for an instance with m pins and k obstacles. Step 1 takes O(m lg m) time for DT construction [4]; step 2 takes O(m (lg m)^2) time for Kruskal’s algorithm; step 3 takes O(n^3) time for the 3D extended escape graph construction, Dijkstra’s algorithm, and 3D U-shaped pattern refinement. As mentioned in Section III.B, steps 1 and 2 effectively guide step 3 and have low time complexities, so they are worthwhile. Although 3D U-shaped pattern refinement in step 3 has a high time complexity, it can be expected to produce good solutions.
Compared with [11], steps 1 and 2 of our algorithm have relatively low time complexities, and step 3 has the same order of complexity. Since the overall time complexity is therefore of the same order, spending the time on sophisticated refinement is a reasonable choice.
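As a rough illustration of step 2, the following Python sketch runs plain Kruskal's algorithm with a union-find structure over a list of weighted edges. The edge list stands in for the DT edges with their obstacle-aware weights; the function name `kruskal_mst` and the sample weights are hypothetical, and the computation of the obstacle-aware weights themselves is outside this sketch.

```python
def kruskal_mst(num_vertices, edges):
    """Return (total_weight, chosen_edges) of a minimum spanning tree.

    edges: list of (weight, u, v) with 0 <= u, v < num_vertices.
    """
    parent = list(range(num_vertices))

    def find(x):
        # Path halving keeps the union-find trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, chosen = 0, []
    for w, u, v in sorted(edges):       # cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                    # edge joins two components
            parent[ru] = rv
            total += w
            chosen.append((u, v))
    return total, chosen


# Hypothetical 4-pin example; weights are illustrative only.
sample_edges = [(4, 0, 1), (3, 1, 2), (5, 0, 2), (2, 2, 3), (7, 1, 3)]
mst_weight, mst_edges = kruskal_mst(4, sample_edges)
```

In this sketch the resulting tree would then serve as the guide for step 3; only the edges kept in `mst_edges` need to be routed on the escape graph.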
Fig. 10. (a) An instance of OAPDST, where Cv = 3, each grid size is 20×20 (unit)^2, and UCi = 1 for all layers. (b)(c) The corresponding DT, obstacle-weighted MST, and 3D extended escape graph. (d) The resulting OAPDST without refinement. (e) The resulting OAPDST with refinement.
Chapter 3 EXTENSIONS
A. Preferred Directions
In this section, we shall demonstrate the flexibility of our algorithm. As shown in Fig. 2, our algorithm for ML-OARSMT can easily be extended to consider preferred directions. We adopt the formulation of the obstacle-avoiding preferred direction Steiner tree (OAPDST) problem in [13].
Problem: Obstacle-Avoiding Preferred Direction Steiner Tree (OAPDST): Given the equivalent wirelength cost Cv of a via, the number Nl of layers, a set P = {p1, p2, …, pm} of pins, a set O = {o1, o2, …, ok} of obstacles, the layer-specific routing cost UCi, 1≦i≦Nl, and the PD constraints, construct a Steiner tree connecting all pins in P such that no tree edge or via intersects any obstacle in O and the total cost of the tree is minimized.
The definitions and restrictions of an obstacle, a pin-vertex, and a via are the same as those in Section II. Here, a routing layer i has a specific routing cost UCi, the unit cost of wires in layer i. Without loss of generality, assume the PD constraints are as follows: the odd (even) layers only allow vertical (horizontal) edges [12, 13].
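The objective and the PD constraints above can be made concrete with a small Python sketch. It assumes that the total cost is the UCi-weighted wirelength plus Cv per via, which is consistent with the cost figures quoted later in this chapter; the `Wire` record and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wire:
    layer: int  # 1-based layer index i
    x1: int
    y1: int
    x2: int
    y2: int


def is_pd_legal(wire):
    """Odd layers allow only vertical wires, even layers only horizontal ones."""
    vertical = wire.x1 == wire.x2
    horizontal = wire.y1 == wire.y2
    return vertical if wire.layer % 2 == 1 else horizontal


def tree_cost(wires, num_vias, via_cost, unit_cost):
    """Assumed cost model: sum of UC_i * wirelength on layer i, plus C_v per via."""
    wire_cost = 0
    for w in wires:
        length = abs(w.x1 - w.x2) + abs(w.y1 - w.y2)
        wire_cost += unit_cost[w.layer] * length
    return wire_cost + via_cost * num_vias


# Hypothetical two-segment route: vertical on layer 1, one via, horizontal on layer 2.
route = [Wire(1, 0, 0, 0, 40), Wire(2, 0, 40, 60, 40)]
route_is_legal = all(is_pd_legal(w) for w in route)
route_cost = tree_cost(route, num_vias=1, via_cost=3, unit_cost={1: 1, 2: 1})
```

Obstacle intersection tests are omitted here; a full feasibility check would also reject any wire or via that crosses an obstacle in O.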
To adapt our algorithm for ML-OARSMT to OAPDST, we apply simple and effective modifications to the DT, the 3D extended escape graph, and 3D U-shaped pattern refinement.
1) The DT: For each edge, the part of the edge weight contributed by the Manhattan distance is multiplied by UCi, and α is changed to be a function of obstacles and UCi. For the edge between pi and pj (located at layers zi and zj), the UCi for vertical (horizontal) segments is the minimum value among the vertical (horizontal) layers from layer min(zi, zj)-1 to layer max(zi, zj)+1.
Fig. 11. (a) Our ML-OARSMT. (b) OAPDST in [13]. (c) The refined tree of (b).
Fig. 12. Degenerate cases for 3D U-shaped pattern refinement in OAPDST. (a) Degenerate basic case: I + L. (b) Degenerate case 2: I + L. (c) Degenerate case 3: I + 2L.
2) The 3D extended escape graph: The horizontal (vertical) edges on odd (even) layers are forbidden and are therefore removed. The edge cost on layer i is scaled by UCi, 1≦i≦Nl.
3) 3D U-shaped pattern refinement: Under the PD constraints, a Steiner-vertex can connect to vias only with vertical edges or only with horizontal edges. For a given U-shaped pattern formed by three vertices, the median of their coordinates may not be valid for a Steiner-vertex. However, the median point can still serve as a reference point to reroute the L-shaped segments on the pattern, so the strategy is the same as that in ML-OARSMT.
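The three modifications above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the evaluated implementation: layers are 1-based with odd layers vertical and even layers horizontal, the layer range is clipped to [1, Nl] with Nl ≥ 2, the α term of the DT edge weight is omitted, and all function names are hypothetical.

```python
from statistics import median


def pd_unit_costs(zi, zj, unit_cost, num_layers):
    """Cheapest vertical and horizontal unit cost from layer min-1 to max+1.

    Clipping the range to [1, num_layers] is an assumption of this sketch.
    """
    lo = max(1, min(zi, zj) - 1)
    hi = min(num_layers, max(zi, zj) + 1)
    layers = range(lo, hi + 1)
    uc_vertical = min(unit_cost[z] for z in layers if z % 2 == 1)
    uc_horizontal = min(unit_cost[z] for z in layers if z % 2 == 0)
    return uc_vertical, uc_horizontal


def dt_manhattan_weight(p, q, unit_cost, num_layers):
    """Manhattan part of a DT edge weight for pins p, q = (x, y, z)."""
    uc_v, uc_h = pd_unit_costs(p[2], q[2], unit_cost, num_layers)
    return uc_h * abs(p[0] - q[0]) + uc_v * abs(p[1] - q[1])


def filter_escape_edges(edges, unit_cost):
    """Drop forbidden in-layer edges and scale the remaining ones by UC_i.

    edges: list of (layer, (x1, y1), (x2, y2)) axis-parallel segments.
    """
    kept = []
    for layer, a, b in edges:
        is_vertical = a[0] == b[0]
        if (layer % 2 == 1) != is_vertical:
            continue  # horizontal on an odd layer or vertical on an even layer
        length = abs(a[0] - b[0]) + abs(a[1] - b[1])
        kept.append((layer, a, b, unit_cost[layer] * length))
    return kept


def median_point(a, b, c):
    """Component-wise median of three (x, y, z) vertices (reference point only)."""
    return tuple(median(t) for t in zip(a, b, c))


# Hypothetical data for a 4-layer instance.
uc = {1: 2, 2: 1, 3: 1, 4: 3}
weight = dt_manhattan_weight((0, 0, 1), (30, 40, 1), uc, num_layers=4)
kept = filter_escape_edges(
    [(1, (0, 0), (0, 20)), (1, (0, 0), (20, 0)),
     (2, (0, 0), (20, 0)), (4, (0, 0), (0, 20))], uc)
ref = median_point((0, 0, 1), (40, 20, 3), (20, 60, 2))
```

Note that `median_point` only returns the reference point; whether a Steiner-vertex may actually be placed there still depends on the PD constraints and the obstacles.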
Fig. 10(a) shows the instance given in [13]; assume Cv = 3, each grid size is 20×20 (unit wirelength)^2, and UCi = 1 for all layers. Fig. 10(b)(c) show the corresponding DT, the obstacle-weighted MST, and the 3D extended escape graph. Fig. 10(d)(e) depict the resulting OAPDST without refinement (cost = 233 (=10×20+11×3)) and with refinement (cost = 227 (=10×20+9×3)), respectively. Fig. 11(a) shows the corresponding ML-OARSMT generated by our algorithm, cost = 218 (=10×20+6×3); it can be viewed as a lower bound of the cost of OAPDST. Fig. 11(b) shows the OAPDST given in [13], cost = 281 (=13×20+7×3), where a standard pattern is highlighted by bold lines. After refining this pattern, we obtain a better tree in Fig. 11(c), cost = 261 (=12×20+7×3). Fig. 12 lists degenerate cases for refinement in OAPDST.
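The costs quoted above follow from counting grid-length wire segments and vias. The short Python check below recomputes them with the stated grid size of 20 and Cv = 3; the dictionary keys are descriptive labels chosen for this sketch.

```python
GRID = 20  # wirelength of one grid edge, as stated in the text
CV = 3     # equivalent wirelength cost of a via, as stated in the text


def cost(grid_edges, vias, grid=GRID, cv=CV):
    """Total cost with UC_i = 1 on every layer."""
    return grid_edges * grid + vias * cv


# (grid edges, vias, cost reported in the text)
reported = {
    "ours_pd_without_refinement": (10, 11, 233),
    "ours_pd_with_refinement": (10, 9, 227),
    "ours_ml": (10, 6, 218),
    "ref13_pd": (13, 7, 281),
    "ref13_pd_refined": (12, 7, 261),
}

recomputed = {name: cost(g, v) for name, (g, v, _) in reported.items()}

# Relative gain from refining the tree of [13]: (281 - 261) / 281.
ref13_gain = (reported["ref13_pd"][2] - reported["ref13_pd_refined"][2]) / reported["ref13_pd"][2]
```

All five recomputed values agree with the reported costs, and refining the tree of [13] corresponds to a gain of roughly 7% on this instance.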
B. Global Routing
To incorporate our Steiner tree construction into global routing, we must consider the capacity of each edge on the global routing graph. Without loss of generality, assume the net ordering is given. On the 3D escape graph, if the capacity of some edge is full, this edge is set as forbidden; otherwise, it can still be used. After the RSMT of a net is constructed, the capacity of each corresponding routed edge is reduced. Moreover, since the grids on upper metal layers are larger than those on lower ones, we may slightly shift the lines of the 3D escape graph to align with their nearest grids.
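The capacity bookkeeping described above can be sketched in Python as follows. The class `CapacityGraph`, its method names, and the sample capacities are hypothetical; the sketch only tracks remaining capacity per undirected edge and does not perform routing or grid alignment.

```python
class CapacityGraph:
    """Tracks the remaining capacity of undirected global routing edges."""

    def __init__(self, capacity):
        # capacity: dict mapping an edge (u, v) to its number of free tracks
        self.remaining = {self._key(u, v): c for (u, v), c in capacity.items()}

    @staticmethod
    def _key(u, v):
        # Normalize so that (u, v) and (v, u) name the same edge.
        return (u, v) if u <= v else (v, u)

    def is_forbidden(self, u, v):
        """An edge whose capacity is used up must not be routed again."""
        return self.remaining[self._key(u, v)] <= 0

    def usable_edges(self):
        return [e for e, c in self.remaining.items() if c > 0]

    def commit_tree(self, tree_edges):
        """Reduce the capacity of every edge used by a newly routed tree."""
        for u, v in tree_edges:
            k = self._key(u, v)
            if self.remaining[k] <= 0:
                raise ValueError(f"edge {k} has no remaining capacity")
            self.remaining[k] -= 1


# Hypothetical example: route one net over edges a-b and b-c.
g = CapacityGraph({("a", "b"): 1, ("b", "c"): 2})
g.commit_tree([("b", "a"), ("b", "c")])
```

Nets would be processed in the given order, each one seeing only the edges returned by `usable_edges()` at that point.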
Chapter 4 EXPERIMENTAL RESULTS
We implemented our algorithm in C++ and executed the program on a PC with an Intel Pentium 4 3.0 GHz CPU and 1 GB memory under Windows XP.
Our results show our algorithm outperforms state-of-the-art works on SL-OARSMT, ML-OARSMT, and OAPDST. Meanwhile, our runtimes are also stable, not increasing much from SL-OARSMT to ML-OARSMT and OAPDST. In addition, the comparison between DT without and with obstacles is provided. Furthermore, the results of timing-driven RSMT are also provided.
TABLE II
SL-OARSMT: THE COMPARISONS ON THE TOTAL WIRELENGTH BETWEEN PREVIOUS WORK [6-10] AND OURS
1HPBB: The half-perimeter of the bounding box of all pin-vertices (which is a lower bound of total wirelength), and “-” refers to “not available.”
2Ours_SL: Our algorithm for SL-OARSMT.
3Nref: All steps of our algorithm are applied, but 3D U-shaped pattern refinement is turned off.
4Full: All steps of our algorithm are applied.
5Mst: Step 1 (DT) and step 2 (MST) of our algorithm are applied.
6,7Avg. (%): Average improvement is computed by averaging (X-G)/X, and (X-G)/(X-Y) for all cases, X = A, B, C, D, E, F.
A. SL-OARSMT
For SL-OARSMT, a total of 14 benchmark circuits were provided by [8]; the first 3 are from industry and the rest are from [6]. We compared our algorithm with those presented in [6], [7], [8], [9], [10]. The results of [6] and [8] are quoted from their papers; those of [7] are quoted from [8]; those of [9] and [10] were obtained on our platform using their binaries. (In addition, the parameter α was set within [0.50, 1.30] depending on the congestion.) As listed in TABLE II, measured as the difference from the half-perimeter of the bounding box of all pins (which is a lower bound of the optimal solution), our algorithm achieved average wirelength improvements of 5.03% to 26.88% over them. Moreover, we had the best results for 12 out of 14 cases. Fig. 13 shows the resulting SL-OARSMTs of sl-rc6 and sl-rc9. Without refinement, on average, we still have a small advantage over [8] and [9] in total wirelength. The novel 3D U-shaped pattern refinement worked well in planar cases and contributed a 2.76% reduction in wirelength. Because our method mainly targets multi-layer routing, the runtime overhead for single-layer cases is reasonable.
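The two averages reported in TABLE II can be computed as in the following Python sketch. It assumes that Y in the table footnote denotes the HPBB lower bound, that cases marked "-" (not available) are skipped, and that at least one case is available; the function name and the sample numbers are hypothetical and are not taken from the table.

```python
def average_improvements(rows):
    """Average (X-G)/X and (X-G)/(X-Y) in percent.

    rows: list of (x, g, hpbb), where x is the wirelength of a previous work
    (None when not available), g is ours, and hpbb is the assumed Y.
    """
    plain, relative = [], []
    for x, g, hpbb in rows:
        if x is None:
            continue  # result not available for this case
        plain.append((x - g) / x)
        if x != hpbb:  # avoid dividing by zero when X already meets the bound
            relative.append((x - g) / (x - hpbb))
    return (100 * sum(plain) / len(plain),
            100 * sum(relative) / len(relative))


# Hypothetical wirelengths for three cases; the third has no previous result.
rows = [(110, 105, 100), (220, 210, 200), (None, 300, 280)]
avg_plain, avg_relative = average_improvements(rows)
```

Measuring against the lower bound magnifies the differences, which is why the second average is much larger than the first for the same data.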
B. ML-OARSMT
For ML-OARSMT, a total of 10 test cases were provided by [11]. ml-ind4 and ml-ind5 simulate the environment for single-layer routing, where all pins and obstacles are located in one layer, and the upper and lower adjacent layers are entirely occupied by two other large obstacles. Fig. 14 displays the ML-OARSMT of ml-ind2 generated by our algorithm with Cv = 3.
We compared our algorithm with [11]. (In addition, the parameter α was set within [0.70, 1.15].) As listed in TABLE III (IV), for Cv = 3 (5), the average improvements in the number of vias and in total cost are 7.69% (4.76%) and 2.77% (2.74%), respectively.
Our algorithm has smaller total costs in 9 out of 10 cases. In addition, for each case our algorithm always generated a smaller total cost with Cv = 3 than with Cv = 5; it can be seen that our algorithm is indeed stable.
TABLE III
ML-OARSMT: THE COMPARISONS ON THE NUMBER OF VIAS, THE TOTAL COSTS, AND CPU TIMES BETWEEN [11] AND OURS UNDER CV = 3
1The runtimes of [11] are quoted from the paper, generated by a 2.8GHz AMD-64 machine with 8GB memory under Ubuntu 6.06 OS. They are listed for reference because the machine is different.
2[11]_ref: 3D U-shaped pattern refinement is directly applied to the resulting ML-OARSMT of [11]. The runtimes of [11]_ref only count for refinement and are measured on our platform.
3Ours_ML: Our algorithm for ML-OARSMT.
3Imp. (%): Average improvement is computed by averaging (A-X)/A for all cases, X = B, C or D.
TABLE IV
ML-OARSMT: THE COMPARISONS ON THE NUMBER OF VIAS, THE TOTAL COSTS, AND CPU TIMES BETWEEN [11] AND OURS UNDER CV = 5
1The runtimes of [11] are quoted from the paper, generated by a 2.8GHz AMD-64 machine with 8GB memory under Ubuntu 6.06 OS. They are listed for reference because the machine is different.
2[11]_ref: 3D U-shaped pattern refinement is directly applied to the resulting ML-OARSMT of [11]. The runtimes of [11]_ref only count for refinement and are measured on our platform.
3Ours_ML: Our algorithm for ML-OARSMT.
3Imp. (%): Average improvement is computed by averaging (A-X)/A for all cases, X = B, C or D.
C. OAPDST
For OAPDST, a total of 10 test cases are used. Seven cases are exactly the same as those used in ML-OARSMT. We did not use ml-ind3 because it is invalid under the PD constraints. pd-ind4, pd-ind5a, and pd-ind5b were modified from ml-ind4 and ml-ind5. For pd-ind4 and pd-ind5a, we inserted one empty layer right above the working layer. For pd-ind5b, we further duplicated the obstacles in the working layer onto the inserted layer. In addition, the routing cost UCi was set to 1 for all i. By doing so, we can see how much worse an OAPDST can be with respect to its ML-OARSMT counterpart. (In addition, the parameter α was set within [0.70, 1.00].) Fig. 15 displays the OAPDST of ml-ind2 generated by our algorithm with Cv = 3.
We compared our algorithm with [13]; because we could not obtain the test cases and the program of [13], we implemented their algorithm and executed it on the same machine described above.
As listed in TABLE V, for Cv = 3, the average degradation of the total costs from ML-OARSMT to OAPDST is 6.47%, but the average speedup of CPU times is 60.18%. The average improvement of the total costs over [13] is 3.20%, while the CPU times are almost the same. Our algorithm has smaller total costs in 9 out of 10 cases.
TABLE VI compares the impacts of the obstacle-weighted MST and 3D U-shaped pattern refinement of our algorithm. Without the guidance from the MST, on average, the total costs degrade by 8.76%, and the CPU times surprisingly become much worse (38.88% slower). Hence, steps 1 and 2 are necessary; they are both efficient and effective. On the other hand, although 3D U-shaped pattern refinement has little influence on our results, it does improve the total costs of [13] by 2.66% on average. The refined results of [13] are still slightly worse than ours when refinement is turned off. Although not presented here, we have similar results for Cv = 5 and for UCi ≠ 1.
TABLE V
OAPDST: THE COMPARISONS ON THE NUMBER OF VIAS, THE TOTAL COST, AND CPU TIMES BETWEEN [13] AND OURS UNDER CV = 3, UCI = 1, 1≦I≦NL
1Ours_ML: Our algorithm is applied without the PD constraints; it can be viewed as the lower bound of the total cost for OAPDST.
2Ours_PD: All steps of our algorithm for OAPDST are applied.
3Imp. (%): Average improvement is computed by averaging (X-B)/X for all cases, where X = A, C.
TABLE VI
OAPDST: THE COMPARISONS ON THE IMPACTS OF OUR ALGORITHM ON THE TOTAL COST AND CPU TIMES UNDER CV = 3, UCI = 1, 1≦I≦NL
1Nmst: Only step 3 of our algorithm is applied, i.e., the tree is directly constructed from the 3D extended escape graph.
2Nref: All steps of our algorithm are applied, but 3D U-shaped pattern refinement is turned off.
3[13]_ref: 3D U-shaped pattern refinement is applied to [13].
4Imp. (%): Average improvement is computed by averaging (X-B)/X for all cases, where B is Ours_PD in TABLE V, X = D, E, F.
Fig. 13. (a) The SL-OARSMT of sl-rc6. (b) The SL-OARSMT of sl-rc9.
Fig. 14. The ML-OARSMT of ml-ind2 under Cv = 3. (a) The DT without illegal edges. (b) The MST. (c)-(g) Layers 2-6, respectively. (h) All pin-vertices are projected onto a pseudo plane, without showing the obstacles.
Fig. 15. The OAPDST of ml-ind2 under Cv = 3. (a) The DT with illegal edges. (b) The MST. (c)-(g) Layers 2-6, respectively. The odd (even) layers allow vertical (horizontal) edges. Some line segments are at obstacle boundaries; they are feasible according to the problem formulation. (h) All pin-vertices are projected onto a pseudo plane, without showing the obstacles.
Chapter 5 CONCLUSION
In this thesis, we solved ML-OARSMT and OAPDST by the same strategy. In addition, we also showed our method can be extended to construct timing-driven RSMTs. Previous work tackles one configuration at a time, while our algorithm can easily handle various configurations. Experimental results showed that our algorithm outperformed the state-of-the-art works. Future work includes the extensions to clock trees and manufacturability-aware trees.
REFERENCES
[1] The International Technology Roadmap for Semiconductors (ITRS), 2007. Available: http://www.itrs.net/
[2] M. R. Garey and D. S. Johnson, “The rectilinear Steiner tree problem is NP-complete,” SIAM J. Appl. Math., vol. 32, no. 4, pp. 826-834, 1977.
[3] J. L. Ganley and J. P. Cohoon, “Routing a multi-terminal critical net: Steiner tree construction in the presence of obstacles,” in Proc. IEEE Int. Symp. on Circuits and Systems (ISCAS’94), vol. 1, May 1994, pp. 113-116.
[4] M. de Berg, O. Cheong, M. van Kreveld, and M. Overmars, Computational Geometry: Algorithms and Applications, 3rd ed., Springer-Verlag, 2008.
[5] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, 2nd ed., MIT Press, 2001.
[6] Z. Feng, Y. Hu, T. Jing, X. Hong, X. Hu, and G. Yan, “An O(nlogn) algorithm for obstacle-avoiding routing tree construction in the lambda geometry plane,” in Proc. ACM Int. Symp. on Physical Design (ISPD’06), Apr. 2006, pp. 48-55.
[7] Z. Shen, C. C. N. Chu, and Y.-M. Li, “Efficient rectilinear Steiner tree construction with rectilinear blockages,” in Proc. IEEE Int. Conf. on Computer Design (ICCD’05), Oct. 2005, pp. 38-44.
[8] P.-C. Wu, J.-R. Gao, and T.-C. Wang, “A fast and stable algorithm for obstacle-avoiding rectilinear Steiner minimal tree construction,” in Proc. ACM/IEEE Asia and South Pacific Design Automation Conf. (ASP-DAC’07), Jan. 2007, pp. 262-267.
[9] C.-W. Lin, S.-Y. Chen, C.-F. Li, Y.-W. Chang, and C.-L. Yang, “Obstacle-avoiding rectilinear Steiner tree construction based on spanning graphs,” IEEE Trans. Computer-Aided Design, vol. 27, no. 4, pp. 643-653, Apr. 2008. Also see Proc. ACM Int. Symp. on Physical Design (ISPD’07), pp. 127-134.
[10] J. Long, H. Zhou, and S. O. Memik, “An O(nlogn) edge-based algorithm for obstacle-avoiding rectilinear Steiner tree construction,” in Proc. ACM Int. Symp. on Physical Design (ISPD’08), Apr. 2008, pp. 126-133.
[11] C.-W. Lin, S.-L. Huang, K.-C. Hsu, M.-X. Lee, and Y.-W. Chang, “Multilayer obstacle-avoiding rectilinear Steiner tree construction based on spanning graphs,” IEEE Trans. Computer-Aided Design, vol. 27, no. 11, pp. 2007-2016, Nov. 2008. Also see Proc. IEEE/ACM Int. Conf. on Computer-aided Design (ICCAD’07), pp. 380-385.
[12] M. C. Yildiz and P. H. Madden, “Preferred direction Steiner trees,” IEEE Trans. Computer-Aided Design, vol. 21, no. 11, pp. 1368-1372, Nov. 2002.
[13] C.-H. Liu, Y.-H. Chou, S.-Y. Yuan, and S.-Y. Kuo, “Efficient multilayer routing based on obstacle-avoiding preferred direction Steiner tree,” in Proc. ACM Int. Symp. on Physical Design (ISPD’08), Apr. 2008, pp. 118-125.
[14] I. H.-R. Jiang, S.-W. Lin, and Y.-T. Yu, “Unification of obstacle-avoiding rectilinear Steiner tree construction,” in Proc. IEEE Int. SOC Conf. (SOCC’08), Sep. 2008.
[15] I. H.-R. Jiang and Y.-T. Yu, “Configurable rectilinear Steiner tree construction for SoC and nano technologies,” in Proc. IEEE Int. Conf. on Computer Design (ICCD’08), Oct. 2008, pp. 34-39.
APPENDIX
Timing-Driven Steiner Trees
In addition to preferred directions, our method can also be applied to other types of Steiner trees. For example, a timing-driven RSMT aims to minimize the path length from the designated source pin to each pin. This can be done by constructing a shortest-path tree instead of an MST at step 2. The shortest-path tree is constructed by Dijkstra’s shortest path algorithm. Moreover, if step 3 remains the same, we may obtain a compromise between the path length and the total cost.
For timing-driven RSMT, the source pin is randomly assigned in our experiments.
TABLE VII lists the results for timing-driven RSMT. Fig. 16 displays the timing-driven OAPDST of ml-ind2 generated by our algorithm with Cv = 5.
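The replacement of the MST by a shortest-path tree at step 2 can be illustrated with a short Python sketch of Dijkstra's algorithm. The adjacency list stands in for the weighted DT; the function name, vertex labels, and weights are hypothetical.

```python
import heapq


def shortest_path_tree(adj, source):
    """Dijkstra's algorithm.

    adj: dict mapping a vertex to a list of (neighbor, weight) pairs with
    non-negative weights. Returns (dist, parent); the parent links form the
    shortest-path tree rooted at the source.
    """
    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent


# Hypothetical 4-pin graph with source pin "s".
adj = {
    "s": [("a", 4), ("b", 1)],
    "a": [("s", 4), ("b", 2), ("c", 5)],
    "b": [("s", 1), ("a", 2), ("c", 8)],
    "c": [("a", 5), ("b", 8)],
}
dist, parent = shortest_path_tree(adj, "s")
```

The edges (parent[v], v) would then take the place of the MST edges as the guide for step 3, which is left unchanged.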
TABLE VII
TIMING-DRIVEN RSMT
1Original: The results of our algorithms to minimize total cost are quoted from TABLE II, III, and V.
2Timing-Driven: To minimize the delay from source to each sink, the minimum spanning tree is replaced with shortest path tree at step 2. The rest of our algorithm is unchanged.
Fig. 16. The timing-driven OAPDST of ml-ind2 under Cv = 5. The DT is the same as in Fig. 15(a). (a) The SPT. (b)-(f) Layers 2-6, respectively. The odd (even) layers allow vertical (horizontal) edges. Some line segments are at obstacle boundaries; they are feasible according to the problem formulation. (g) All pin-vertices are projected onto a pseudo plane, without showing the obstacles.