In this chapter, an O-tree based track assignment is proposed to deal with variable-width and variable-space IRoutes. To our knowledge, this is the first study to discuss gridless track assignment. According to the observations in [4, 19], the two factors that most heavily affect coupling capacitance are the spacing and the overlap length between two wires. An important characteristic of the gridless routing problem is that nets have different wire width and spacing rules. The experiments in [19] show that wire width has only a small impact on coupling: the coupling effect varies by about 0.4% to 7% when the wire width is doubled or even tripled. Therefore, the coupling effect estimation ignores the wire width, and the model presented in Chapter 2.1 is applied to the gridless track assignment problem. The space between any two adjacent IRoutes is assumed to be fixed in the following discussion. An extension to variable space rules is outlined at the end of this chapter.
Assume that the required space between any two adjacent IRoutes is sp. Each IRoute is over-sized by sp/2 to guarantee a legal separation between IRoutes. Therefore, an over-sized IRoute can overlap with its adjacent IRoutes, and only these overlapping adjacent IRoutes are considered to induce coupling capacitance; that is, any two IRoutes with a non-zero separation between them are assumed to be free of crosstalk. After the original IRoutes are over-sized, the gridless track assignment problem becomes a special placement problem. Each over-sized IRoute can be regarded as a block constrained to a fixed x-coordinate. The objective is to find a complete placement with minimum block abutting length within a routing region of fixed height. An important issue in placement is how to represent and maintain the contour of a partial placement. The B*-tree [20] and the O-tree [21] are two well-known representations of non-slicing placements. The B*-tree is well suited to two-dimensional moves, and the O-tree to one-dimensional moves. This study uses the O-tree to represent the assignment so that IRoutes can be swapped and moved quickly.
The O-tree of a placement can be constructed as follows. Assume the placement is T-compact, that is, no block can be shifted upwards from its current position while the other blocks stay fixed. A vertical O-tree has a root node at the top that represents the top boundary, and one node for each block. The root node has a directed edge to each node whose block has its top border on the top boundary. For two block nodes, say bi and bj, there is an edge from bi to bj if bi and bj abut and bi is on top of bj. In this study, the O-tree is enhanced by adding an edge between any two blocks that have a non-zero overlapping length and can see each other; the result is called an extended O-tree. Figure 11(a) shows a T-compact placement, Fig. 11(b) shows its vertical O-tree, and Fig. 11(c) shows the extended vertical O-tree, where the dashed edges are those that do not appear in the O-tree. In Fig. 11(a), blocks 2 and 7 have a non-zero overlapping length, but there is no directed edge between them because they are separated by blocks 4 and 5. In contrast, blocks 1 and 7 can see each other, so there is an edge from block 1 to block 7.
Figure 11. (a) A T-compact placement; (b) related vertical O-tree; (c) extended vertical O-tree.
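The edge rule of the extended O-tree can be illustrated with a small sketch. It is not the thesis implementation; it is a minimal Python illustration under stated assumptions: blocks are axis-aligned rectangles that are already over-sized, y is measured downwards from the top boundary, and two blocks "see each other" when some part of their common horizontal range is not covered by blocks lying between them. The names Block, visible_length and extended_edges are hypothetical, and using the visible length as the overlap cost is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    x_lo: float
    x_hi: float
    y_top: float   # distance from the top boundary (grows downwards)
    height: float

    @property
    def y_bot(self):
        return self.y_top + self.height


def visible_length(upper, lower, blocks):
    """Length of the common x-range of upper and lower not hidden by blocks between them."""
    lo, hi = max(upper.x_lo, lower.x_lo), min(upper.x_hi, lower.x_hi)
    if hi <= lo:
        return 0.0
    # x-spans (clipped to [lo, hi]) of blocks lying vertically between the two
    spans = sorted(
        (max(b.x_lo, lo), min(b.x_hi, hi))
        for b in blocks
        if b is not upper and b is not lower
        and b.y_top >= upper.y_bot and b.y_bot <= lower.y_top
        and min(b.x_hi, hi) > max(b.x_lo, lo)
    )
    seen, cursor = 0.0, lo
    for s_lo, s_hi in spans:
        if s_lo > cursor:          # uncovered gap before this blocker
            seen += s_lo - cursor
        cursor = max(cursor, s_hi)
    return seen + max(0.0, hi - cursor)


def extended_edges(blocks):
    """Map (upper, lower) -> (overlap length, separation distance)."""
    edges = {}
    for a in blocks:
        for b in blocks:
            if a is not b and a.y_bot <= b.y_top:
                length = visible_length(a, b, blocks)
                if length > 0:
                    edges[(a.name, b.name)] = (length, b.y_top - a.y_bot)
    return edges
```

For a placement with A on top, B and C side by side below A, and D below both, the sketch yields zero-separation edges A→B and A→C, extended edges B→D and C→D with non-zero separation, and no edge A→D because B and C together hide D from A.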
OTTA contains three steps: (1) initial assignment, (2) extended O-tree based assignment refinement (EOAR), and (3) sub-panel rearrangement.
(1) Initial assignment: the goal of the initial assignment is to quickly produce an assignment with good utilization. For grid-based track assignment, the left-edge algorithm can be applied to obtain a utilization-driven initial assignment. Because gridless track assignment tends to produce an uneven partial assignment, the region can hardly be treated row by row. To take the crosstalk minimization objective into account, the initial assignment combines the minimum weighted Hamiltonian path on the maximum clique with a concept similar to the left-edge algorithm, balancing crosstalk minimization and track utilization. An OLG is first established and a maximum clique is found. The IRoutes in the maximum clique are assigned to tracks in the order of a minimum weighted Hamiltonian path [16]. After the crosstalk induced by this most congested IRoute group is minimized, the unassigned IRoutes located on the right side of the partial assignment are sorted by their left borders in increasing order. These IRoutes are processed in sorted order, and each is assigned to the topmost available space. The unassigned IRoutes located on the left side of the partial assignment are sorted by their right borders in decreasing order, and each is likewise assigned, in sorted order, to the topmost available space. Figures 12 to 15 show the process of an initial assignment.
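The "topmost available space" rule for the IRoutes to the right of the partial assignment can be sketched as follows. This is a simplified illustration, not the thesis code: it assumes the maximum-clique IRoutes have already been placed, represents each placed block as a tuple (x_lo, x_hi, y_top, height) with y measured downwards from the top boundary, and does not check the panel height limit. The function names are hypothetical.

```python
def topmost_free_y(placed, x_lo, x_hi, height):
    """Smallest y (distance from the top boundary) at which the block fits."""
    # Vertical intervals already occupied inside the block's horizontal span.
    busy = sorted((y, y + h) for (lo, hi, y, h) in placed
                  if min(hi, x_hi) > max(lo, x_lo))
    y = 0
    for top, bottom in busy:
        if top - y >= height:      # the gap above this interval is large enough
            break
        y = max(y, bottom)
    return y


def assign_right_side(placed, iroutes):
    """iroutes: (name, x_lo, x_hi, height) tuples right of the partial assignment.

    Processes them by increasing left border and appends each to `placed`.
    """
    result = {}
    for name, x_lo, x_hi, h in sorted(iroutes, key=lambda r: r[1]):
        y = topmost_free_y(placed, x_lo, x_hi, h)
        placed.append((x_lo, x_hi, y, h))
        result[name] = y
    return result
```

The left side would be handled symmetrically, sorting by right border in decreasing order before calling the same placement helper.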
(2) Extended O-tree based assignment refinement (EOAR): after the initial assignment is produced, the corresponding extended O-tree of the over-sized IRoute placement is established. Each node of the extended O-tree stands for an IRoute and there exists an edge between two IRoutes if there is a non-zero vertical projection between them.
Each edge carries two costs: the overlap length and the separation distance. Because the over-sized IRoutes in the placement already account for the separation rule, two over-sized IRoutes separated by a non-zero space are assumed to be free of crosstalk. Four operations on the O-tree, DeleteNode, InsertNode, PlowTree, and CompactTree, are supported to perform crosstalk minimization. InsertNode adds a node to the O-tree and DeleteNode deletes a node from it. PlowTree reserves space for a node insertion: it plows all nodes whose vertical range contains a given horizontal line, called the plow line, downwards by the height of the block to be inserted. For an insertion between two nodes, say bt (on the top) and bb (on the bottom), the plow line is the extension of the top border of the bottom node bb. It might seem more efficient to plow only the nodes whose horizontal range overlaps the node to be inserted, rather than all nodes that intersect the plow line. However, plowing all nodes is what guarantees that the insertion succeeds. For example, if a node is to be inserted on top of Node 11, as shown in Fig. 17, and the plowing distance is larger than the separation distance between blocks 2 and 8, then the insertion fails if only the sub-tree of Node 11 is plowed. CompactTree compacts the blocks upwards to make the placement T-compact. A node is said to be movable if all its incident edges have a non-zero distance cost. CompactTree applies breadth-first search on the extended O-tree to pull the movable nodes upwards. The height of an assignment is the maximum path length, where a path length is the sum of the separation distances of all edges along the path and the block heights of all nodes along the path. The total coupling capacitance cost is the total overlap length of all zero-separation edges. Figure 16 shows the extended O-tree of the assignment in Fig. 15.
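The two quantities defined above, the total coupling cost and the assignment height, can be computed directly from the edge costs. The following is a minimal sketch under assumptions: edges are stored as a dictionary mapping (upper, lower) to (overlap length, separation distance), root edges are given as a dictionary from block name to its separation from the top boundary, and every block is reachable from the root. The data layout and function names are hypothetical.

```python
from functools import lru_cache


def coupling_cost(edges):
    """Total overlap length over all zero-separation edges."""
    return sum(overlap for overlap, sep in edges.values() if sep == 0)


def assignment_height(block_height, edges, root_sep):
    """Maximum path length: separations of edges plus heights of nodes on the path."""
    parents = {}
    for (upper, lower), (_overlap, sep) in edges.items():
        parents.setdefault(lower, []).append((upper, sep))

    @lru_cache(maxsize=None)
    def bottom(name):
        # Longest path from the root to the bottom border of this block.
        reach = [bottom(upper) + sep for upper, sep in parents.get(name, [])]
        if name in root_sep:
            reach.append(root_sep[name])
        return max(reach) + block_height[name]

    return max(bottom(name) for name in block_height)
```

Only edges with zero separation contribute to the cost, which reflects the model's assumption that over-sized IRoutes with a gap between them do not couple.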
EOAR performs the same procedure for each node. For each node, EOAR first deletes the node from the extended O-tree and then tentatively inserts it on top of each node that overlaps with it, one at a time. The insertion with the minimum crosstalk is kept, and CompactTree follows to make the assignment T-compact. EOAR allows each IRoute to move far from its original position, but it considers only the crosstalk between an individual IRoute and its neighbors. Further crosstalk reduction can be achieved by taking the effect between rows into account.
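The effect of the compaction step that follows each move can be illustrated geometrically. Note that this sketch swaps in a simpler technique than the one described in the text: instead of a breadth-first search on the extended O-tree, it processes blocks from top to bottom and pulls each one up until it abuts the top boundary or a block above it. It assumes a legal placement (no two blocks overlap), a fixed x-range per block, and y measured downwards; the function name and data layout are hypothetical.

```python
def compact_upward(blocks):
    """blocks: name -> (x_lo, x_hi, y_top, height). Returns name -> compacted y_top."""
    settled = []   # (x_lo, x_hi, y_bottom) of blocks already compacted
    new_top = {}
    for name, (x_lo, x_hi, _y, h) in sorted(blocks.items(), key=lambda kv: kv[1][2]):
        # Lowest bottom border among settled blocks sharing some x-range.
        y = max((bot for lo, hi, bot in settled
                 if min(hi, x_hi) > max(lo, x_lo)), default=0)
        new_top[name] = y
        settled.append((x_lo, x_hi, y + h))
    return new_top
```

Because the blocks keep their relative vertical order within every shared x-range, the result is a placement in which no block can move further up with the others fixed.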
(3) Sub-panel rearrangement: considering the global crosstalk effect, the assignment on a panel can be regarded as assignments on several sub-panels. For example, the assignment in Fig. 22 can be split into six sub-panels. Rearrangement of these sub-panels can further reduce crosstalk effect. A sub-panel OLG is first constructed, where each node in the graph represents a sub-panel and there are two directed edges between any two nodes. Since the top and bottom contours of a sub-panel are not symmetrical, two sub-panels can be assigned in two ways. The cost of an edge is the total overlap length of two sub-panels associated with the end nodes of the edge.
When calculating the overlap length of two sub-panels, only the over-sized IRoutes that touch the abutting border are considered, rather than the whole boundary contours, because two over-sized IRoutes with non-zero separation are free of crosstalk in our model. The two directed edges between two sub-panels can be regarded as one pseudo edge, so the sub-panel OLG can be treated as a complete graph. A partial sub-panel OLG of the assignment in Fig. 22 is shown in Fig. 23.
After the assignment is partitioned into as many sub-panels as possible, as shown in Fig. 22, the crosstalk minimization problem can be formulated as finding the minimum weighted Hamiltonian path (MWHP) on the sub-panel OLG. The heuristic algorithm in [16] for finding an MWHP on an IRoute OLG can be applied with little modification. The MWHP search starts from the node with the maximum inward or outward edge cost. If the maximum cost is caused by an outward edge, the sub-panel is mirrored and then placed at the top of the panel. Next, the least-cost outward edge of the start node and the node at its other end are added to the Hamiltonian path, provided that this node has not yet been visited. The newly added node becomes the start node of the next iteration. This process continues until all nodes have been visited. The node sequence along the MWHP forms the new sub-panel order on the panel. Figure 24 shows a new sub-panel order for the assignment in Fig. 22. If the original panel is very loose and contains empty sub-panels, these are inserted between the sub-panels to reduce the overlap length; furthermore, local refinement, such as pulling IRoutes upwards when there is space above them, can compact the assignment without increasing crosstalk. Figure 25 shows the final assignment after local refinement.
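The greedy path search described above can be sketched as follows. This is a simplified illustration rather than the heuristic of [16]: it assumes the sub-panel OLG is given as a cost matrix in which cost[i][j] is the overlap length when sub-panel i sits directly above sub-panel j, it omits the mirroring step, and it breaks ties by the lowest index. The function name is hypothetical.

```python
def greedy_mwhp(cost):
    """Greedy Hamiltonian path on a directed complete graph with at least one node."""
    n = len(cost)

    def worst_edge(i):
        # Largest inward or outward edge cost incident to node i.
        return max((max(cost[i][j], cost[j][i]) for j in range(n) if j != i),
                   default=0)

    start = max(range(n), key=worst_edge)
    path, total = [start], 0
    while len(path) < n:
        cur = path[-1]
        # Follow the least-cost outward edge to a node not yet visited.
        nxt = min((j for j in range(n) if j not in path),
                  key=lambda j: cost[cur][j])
        total += cost[cur][nxt]
        path.append(nxt)
    return path, total
```

Starting at the node with the most expensive incident edge places that sub-panel at the end of the path, so its costly edge is less likely to be selected; the result is a heuristic order, not a guaranteed minimum.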
Chapter 4
Experimental Results
The proposed HZTA, SMTA and OTTA algorithms were implemented in C++.
The benchmark circuits were run on an Intel 2.4 GHz PC with 768 MB of RAM. For grid-based track assignment, Table 1 lists the statistics of eight small cases, taken from examples in channel routing papers, and eight benchmark circuits. For comparison with the work in [16], the TA algorithm in [16] was implemented and run on the same machine.
Table 2 compares the test results with those in [16]. The cost in the first column of each method in Table 2 is the total crosstalk of TA, i.e., the total overlapping length in the panel.
The TA of [16] fails to complete the assignment for test1. Both HZTA and SMTA complete the assignment of test1; furthermore, SMTA obtains less crosstalk than the method in [8].
Since the panels in the S-series benchmarks are very loose, HZTA and SMTA can produce assignments with zero overlap length. The algorithm in [16] does not consider loose panels, so it still produces assignments with non-zero overlap length. To avoid an unfair comparison, the results of these benchmark circuits are excluded from the crosstalk reduction rate. In summary, HZTA and SMTA complete the assignment and achieve 42.49% and 46.79% better crosstalk reduction, respectively, than the method in [16]. The crosstalk budget of SMTA is by default set equal to the maximum overlap length in HZTA, and it can be adjusted slightly to obtain better results.
For gridless track assignment, the IRoute width is generated randomly: the wire width of five percent of the IRoutes is tripled and that of fifty percent is doubled, while the others remain unchanged. Table 3 gives the test case information and Table 4 gives the reduction rate of the total overlap length in each stage. Sub-panel rearrangement achieves more gain than extended O-tree based assignment refinement.
Table 1. The information of test cases.
Case name No. of nets Track size Panel length (No. of GCells)
test1 12 5 20
Table 2. The comparisons for three CTA algorithms.
Result of [16] HZTA SMTA
mcc1 18088 0 1.382 834 0 9.953 95.38% 646 0 1.406 96.42%
mcc2 227842 0 21.453 5968 0 180.828 97.38% 3590 0 10.06 98.42%
S9234 462 0 0.047 0 0 0.687 ****** 0 0 0.062 ******
Table 3. The test case information of OTTA.
Case name No. of nets Total Width (µm) Total Height (µm) Column (GCells) Panel (GCells)
S9234 2774 403988 224994 26 14
Table 4. The result of OTTA.
(1) Initial assignment (2) extended O-tree based assignment refinement
S9234 14584798 13040730 10.58% 4641932 68.17% 0.625
S5378 26164870 21124872 19.26% 11956692 54.14% 0.703
S13207 59607946 46065118 22.71% 23539288 60.50% 1.05
S15850 77821208 56293274 27.66% 26615526 65.79% 1.156
S38417 119548232 99007772 17.18% 34013838 71.54% 1.765
S38584 172762856 130657404 24.37% 61779606 64.24% 2.188
Average 17.43% 64.6%
OL after (1, 2, 3): total overlap length after step (1, 2, 3) (µm); R.R. (reduction rate): (OL(1) – OL(k)) / OL(1) for k = 2, 3.
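As a quick check of the reduction rate definition, the S9234 row of Table 4 can be recomputed. This is a small illustrative snippet; the function name is hypothetical.

```python
def reduction_rate(ol_initial, ol_after):
    """Fraction of the initial overlap length removed by a later step."""
    return (ol_initial - ol_after) / ol_initial


# S9234: overlap length after steps (1), (2) and (3), in µm
ol1, ol2, ol3 = 14584798, 13040730, 4641932
rr_step2 = reduction_rate(ol1, ol2)
rr_step3 = reduction_rate(ol1, ol3)
```

Both values agree with the tabulated 10.58% and 68.17% up to rounding in the last digit, which confirms that the rate after step (3) is also measured against the initial assignment.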
Chapter 5 Conclusions
This thesis proposes three utilization- and crosstalk-driven TA algorithms: HZTA, SMTA, and OTTA. HZTA processes odd-numbered tracks row by row and even-numbered tracks zone by zone, while SMTA reduces crosstalk by moving and swapping critical IRoutes based on an initial assignment. This thesis also proposes the first gridless TA algorithm.
Based on the proposed extended O-tree and its four underlying operations, namely DeleteNode, InsertNode, PlowTree, and CompactTree, each IRoute has a chance to escape from the position assigned by the initial assignment. Global crosstalk reduction can be further achieved by sub-panel rearrangement.
Experimental results show that HZTA achieves a 42.49% better crosstalk reduction than the method in [16], while SMTA achieves 46.79%. Both HZTA and SMTA complete the assignment for all test cases. Finally, OTTA reduces the coupling effects by about 64% on average relative to the initial assignment.
Bibliography
[1] A. B. Kahng and S. Muddu, “New Efficient Algorithm for Computing Effective Capacitance,” Proceeding of International Symposium on Physical Design, pp. 147–151, Apr. 1998.
[2] S. Tani, Y. Uchida and M. Furuie, “Parasitic Capacitance Modeling for Multilevel Interconnects,” Asia-Pacific Conference on Circuits and Systems, Vol. 1, pp. 28-31, Oct. 2002.
[3] S. Tani, Y. Uchida and M. Furuie, “Parasitic Capacitance Modeling for Multilevel Interconnects,” Asia-Pacific Conference on Circuits and Systems, Vol. 1, pp. 28-31, Oct. 2002.
[4] S. W. Tu, W. Z. Shen, Y. W. Chang and T. C. Chen, “On-Chip Inductance modeling for coplanar interconnect structure,” Proceeding of IEEE International Symposium on Circuit and System, Vol. 3, pp. 787-790, 2002.
[5] S. M. Sait and H. Youssef, “VLSI physical design automation,” World Scientific Publishing, 1999.
[6] A. Kahng and G. Robins, “A New Class of Steiner Tree Heuristics with Good Performance: the Iterated 1-Steiner Approach,” IEEE/ACM International Conference on Computer Aided Design, 1990.
[7] H. Zhou, “Efficient Steiner Tree Construction Based on Spanning Graphs,” IEEE Transactions on Computer Aided Design, pp. 704-710, May 2004.
[8] C. Y. Lee, “An Algorithm for Path Connections and Its Applications,” IRE Trans. Electronic Computers, pp. 346-365, Sep. 1961.
[9] J. Soukup, “Fast Maze Router,” Proceedings of Design Automation Conference, pp. 100-102, 1978.
[10] S. W. Hur, A. Jagannathan and J. Lillis, “Timing Driven Maze Routing,” International Symposium on Physical Design, pp. 208-213, Apr. 1999.
[11] S. Batterywala, N. Shenoy, W. Nicholls and H. Zhou, “Track Assignment: A Desirable Intermediate Step Between Global Routing and Detail Routing,” IEEE/ACM International Conference on Computer Aided Design, pp. 59-66, Nov. 2002.
[12] H. Zhou and D. F. Wong, “Global Routing with Crosstalk Constraints,” Design Automation Conference, pp. 374-377, May 1998.
[13] J. Xiong and L. He, “Full-Chip Routing Optimization With RLC Crosstalk Budgeting,” IEEE Transactions on Computer Aided Design, pp. 366-377, Mar. 2004.
[14] J. D. Cho, S. Raje and M. Sarrafzadeh, “Crosstalk-Minimum Layer Assignment,” IEEE Custom Integrated Circuits Conference, pp. 29.7.1-29.7.4, May 1993.
[15] Di Wu, J. Hu, R. Mahapatra and M. Zhao, “Layer Assignment for Crosstalk Risk Minimization,” Design Automation Conference, pp. 159-162, Jan. 2004.
[16] T. Y. Ho, Y. W. Chang, S. J. Chen and D. T. Lee, “A Fast Crosstalk- and Performance-Driven Multilevel Routing System,” IEEE/ACM International Conference on Computer Aided Design, pp. 382-387, Nov. 2003.
[17] T. Gao and C. L. Liu, “Minimum Crosstalk Channel Routing,” IEEE/ACM International Conference on Computer Aided Design, pp. 692-696, 1993.
[18] L. E. Liu and C. Sechen, “Multi-Layer Chip-Level Global Routing Using an Efficient Graph-based Steiner Tree Heuristic,” Proceeding of the European Design and Test Conference, pp. 331-318, 1997.
[19] L. He and M. Xu, “Modeling and Layout Optimization for On-Chip Inductive Coupling,” U. of Wisconsin at Madison, Technical Report ECE-00-1, Dec 1999.
[20] Y. C. Chang, Y. W. Chang, G. M. Wu, and S. W. Wu, “B*-trees: A New Representation for Non-slicing Floorplans,” Proceeding of ACM/IEEE Design Automation Conference, pp. 458-463, June 2000.
[21] P. N. Guo, C. K. Cheng, and T. Yoshimura, “An O-tree Representation of Non-Slicing Floorplan and Its Applications,” Annual ACM IEEE Design Automation Conference, pp. 268-273, 1999.