Background - Routability and Performance Optimization Techniques for the Placement and Routing Flow

Chapter 1 Introduction

1.2 Background

In the global routing problem, the given placement solution is typically partitioned into a three-dimensional (3D) array of global cells (Fig. 1.3(a)), and the array is then modeled as a 3D grid graph (Fig. 1.3(b)). There are generally two strategies for global routing on the 3D grid graph. The first performs global routing directly on the 3D grid graph [3-6]. Although this may achieve a better result, it is time-consuming. The mainstream approach therefore first condenses the 3D grid graph into a 2D grid graph and performs 2D global routing to obtain a 2D routing result; layer assignment algorithms [17, 27-33] then assign each routing wire to a metal layer to obtain a 3D global routing result [7-21]. Figure 1.3(c) shows the general flow adopted by most global routers to tackle the 3D global routing problem. The function of each stage is detailed below.

1.2.1 Building Grid Graph and the Objectives of Global Routing

In the grid graph, each grid node represents a global cell (G-cell), and each grid edge corresponds to the boundary between two abutting G-cells in the same layer. Each via edge connects two abutting G-cells in adjacent layers. The capacity c(e) of a grid edge e is the number of routing tracks that can be accommodated across the abutting boundary, and the demand d(e) is the number of wires that pass through e. The overflow of a grid edge e is defined as max(d(e) - c(e), 0); the total overflow is the sum of the overflows of all grid edges, and the maximum overflow is the largest overflow among all grid edges. For simplicity, the capacity of each via edge is not limited, an assumption also adopted in most global routing research [3-21]. Given the pin locations of each net on the grid graph, the objective of the global routing problem is to identify a highly routable global path that connects the pins of each net. The quality of a global routing result is generally measured by the total overflow and the wirelength.
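The overflow metrics defined above can be illustrated with a minimal sketch. The function names and the dictionary layout (edge mapped to a demand/capacity pair) are assumptions made for illustration and do not come from any particular router.

```python
def overflow(demand, capacity):
    """Overflow of one grid edge: max(d(e) - c(e), 0)."""
    return max(demand - capacity, 0)


def overflow_stats(edges):
    """Return (total overflow, maximum overflow) over all grid edges.

    `edges` maps a hypothetical edge identifier to (demand, capacity).
    """
    overflows = [overflow(d, c) for d, c in edges.values()]
    return sum(overflows), max(overflows, default=0)


# Illustrative values only.
edges = {
    ("A", "B"): (12, 10),  # overflow 2
    ("B", "C"): (4, 10),   # overflow 0
    ("C", "D"): (15, 10),  # overflow 5
}
total, worst = overflow_stats(edges)
print(total, worst)
```

Via edges are left out of the dictionary because their capacity is treated as unlimited in the text.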

Fig. 1.3. (a) Partitioning a placement into a 3D array of G-cells; (b) modeling the 3D array of G-cells as a grid graph; (c) typical global routing flow.

Figure 1.4 shows how to compute the capacity of 2D grid edges in the mainstream flow of 2D global routing with layer assignment, in which the numbers next to the 3D grid edges denote the capacities of the 3D edges. After the 3D grid graph is compacted into a 2D graph, the capacity of a 2D grid edge is obtained by adding up the capacities of its corresponding 3D grid edges.
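The capacity compaction described above amounts to a per-edge sum over layers. The following is a minimal sketch; the `(edge, layer)` key layout and the capacity values are assumptions chosen for illustration.

```python
def compact_capacity(capacities_3d):
    """Sum 3D edge capacities over all layers to get 2D edge capacities.

    `capacities_3d` maps (edge_2d, layer) to the capacity of that 3D edge,
    where `edge_2d` is a hypothetical identifier of the projected 2D edge.
    """
    capacities_2d = {}
    for (edge, _layer), cap in capacities_3d.items():
        capacities_2d[edge] = capacities_2d.get(edge, 0) + cap
    return capacities_2d


# Illustrative three-layer example with one horizontal and one vertical edge.
caps_3d = {
    ("h", 1): 10, ("h", 2): 5, ("h", 3): 0,
    ("v", 1): 0,  ("v", 2): 1, ("v", 3): 10,
}
print(compact_capacity(caps_3d))
```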

1.2.2 Net Decomposition

Most global routers decompose each multi-pin net into two-pin subnets, because net decomposition reduces a multi-terminal routing problem to a set of two-terminal routing problems. Before the routing stages, a rectilinear Steiner minimal tree (RSMT) or rectilinear minimum spanning tree (RMST) construction algorithm is commonly used to generate the initial topology of each multi-pin net, and the net is then decomposed into two-pin subnets according to that topology. For example, Figs. 1.5(a) and 1.5(b) show the initial topologies of a four-pin net generated by RSMT and RMST, respectively, in which the green rectangle denotes a Steiner point; these topologies decompose into four and three two-pin subnets, respectively. Because an RSMT is never longer than an RMST of the same net, net decomposition by RSMT is popular in the literature. FLUTE [23] is a very fast and accurate RSMT construction tool that is widely used by modern global routers. FLUTE not only quickly constructs a good RSMT for a multi-pin net, but also obtains optimal RSMTs for nets with nine or fewer pins.

Fig. 1.4. Modern 3D global routing flow. (Edge capacity labels in the figure: 10, 5, 0 with sum 15; 0, 1, 10 with sum 11.)

However, FGR [3] indicates that the RSMT has less routing flexibility than the RMST, because it introduces Steiner points and generates more flat segments, and that the data structure used for RSMTs is more complex than that used for RMSTs. In contrast, with an RMST each subnet can simply be routed by pattern routing or monotonic routing to avoid congested regions. Considering both wirelength and routing flexibility, an ideal solution is an RMST that encourages multiple two-pin routes to merge, so that their paths pass through the same grid edges (Fig. 1.5(c)). Such a solution avoids congested regions while using a shorter total wirelength than an RMST routed without joint wires. However, identifying an RMST with joint wires is a challenge.
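As a concrete illustration of RMST-based net decomposition, the sketch below builds a spanning tree over the pins with Prim's algorithm under the Manhattan metric and returns its edges as two-pin subnets. It is a simple cubic-time sketch for small nets, not the construction used by any cited router, and it does not attempt to find joint wires. The function names and pin coordinates are assumptions.

```python
def manhattan(p, q):
    """Rectilinear distance between two pins given as (x, y) tuples."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def rmst_subnets(pins):
    """Decompose a multi-pin net into two-pin subnets along an RMST.

    Prim's algorithm: repeatedly attach the pin outside the tree that is
    closest (in Manhattan distance) to some pin already in the tree.
    """
    if len(pins) < 2:
        return []
    in_tree = [pins[0]]
    remaining = list(pins[1:])
    subnets = []
    while remaining:
        u, v = min(
            ((a, b) for a in in_tree for b in remaining),
            key=lambda pair: manhattan(*pair),
        )
        subnets.append((u, v))
        in_tree.append(v)
        remaining.remove(v)
    return subnets


# Illustrative four-pin net: an RMST always yields (number of pins - 1) subnets.
pins = [(0, 0), (4, 0), (4, 3), (1, 3)]
subnets = rmst_subnets(pins)
print(subnets, sum(manhattan(u, v) for u, v in subnets))
```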

1.2.3 Pattern Routing and Monotonic Routing

Pattern routing adopts specific routing patterns to connect two pins. The most common patterns are L-shaped and Z-shaped. The main advantage of pattern routing is that it completes the path search in a very short time, but its solution space is very small. To narrow the large performance gap between pattern routing and maze routing, Pan et al. [14] presented monotonic routing to enrich the solution space.
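A minimal sketch of L-shaped pattern routing is given below. It evaluates the two possible L-shaped paths between a source and a target and keeps the cheaper one. The edge representation (a pair of adjacent grid nodes, lower coordinate first) and the user-supplied `cost` function are assumptions for illustration.

```python
def segment_edges(a, b):
    """Grid edges along a straight horizontal or vertical segment from a to b."""
    (x0, y0), (x1, y1) = a, b
    if y0 == y1:  # horizontal segment (or a single point)
        return [((x, y0), (x + 1, y0)) for x in range(min(x0, x1), max(x0, x1))]
    # otherwise the segment is vertical (x0 == x1)
    return [((x0, y), (x0, y + 1)) for y in range(min(y0, y1), max(y0, y1))]


def l_route(src, dst, cost):
    """Return (total cost, edge list) of the cheaper of the two L-shaped paths."""
    best = None
    for corner in [(dst[0], src[1]), (src[0], dst[1])]:
        path = segment_edges(src, corner) + segment_edges(corner, dst)
        total = sum(cost(e) for e in path)
        if best is None or total < best[0]:
            best = (total, path)
    return best


# Illustrative cost: one congested edge on the bottom row costs 10, others cost 1.
congested = {((1, 0), (2, 0))}
total, path = l_route((0, 0), (2, 1), lambda e: 10 if e in congested else 1)
print(total, path)
```

Only two candidate paths are examined, which is why the search is fast and the solution space is small.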

Monotonic routing uses dynamic programming to identify a routing path from the source to the target without any detour. The time complexity of monotonic routing on an m × n grid graph is O(mn), which is the same as that of Z-shaped pattern routing.
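The dynamic program behind monotonic routing can be sketched as follows. For brevity the sketch assumes that the target lies to the upper right of the source, so every step moves right or up; other orientations can be handled by mirroring. The edge representation and the `cost` function are assumptions for illustration, and each grid node in the bounding box is visited once, giving O(mn) time.

```python
def monotonic_route(src, dst, cost):
    """Cheapest detour-free path from src to dst; returns (cost, edge list).

    Assumes dst is not left of or below src. `cost` maps an edge, given as
    (node, node) with the lower coordinate first, to a number.
    """
    (x0, y0), (x1, y1) = src, dst
    assert x0 <= x1 and y0 <= y1, "sketch assumes dst is up-right of src"
    best = {src: (0, None)}  # node -> (cheapest cost so far, predecessor)
    for x in range(x0, x1 + 1):
        for y in range(y0, y1 + 1):
            if (x, y) == src:
                continue
            candidates = []
            if x > x0:  # arrive from the left neighbour
                prev = (x - 1, y)
                candidates.append((best[prev][0] + cost((prev, (x, y))), prev))
            if y > y0:  # arrive from the lower neighbour
                prev = (x, y - 1)
                candidates.append((best[prev][0] + cost((prev, (x, y))), prev))
            best[(x, y)] = min(candidates)
    path, node = [], dst
    while node != src:  # walk predecessors back to the source
        prev = best[node][1]
        path.append((prev, node))
        node = prev
    path.reverse()
    return best[dst][0], path


# Illustrative costs: three expensive edges force a staircase path.
expensive = {((0, 0), (0, 1)), ((1, 0), (2, 0)), ((1, 1), (1, 2))}
total, path = monotonic_route((0, 0), (2, 2), lambda e: 10 if e in expensive else 1)
print(total, path)
```

In this example the cheapest path has three bends, so it is found by monotonic routing but by neither the L-shaped nor the Z-shaped pattern.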

1.2.4 Negotiation-based Rip-up and Rerouting (NRR)

Rip-up and rerouting is widely used in global and detailed routing. Given an illegal routing solution, rip-up and rerouting iteratively removes the nets with violations and reroutes them sequentially to eliminate the violations. In the global routing problem, a violation occurs when an overflow is produced. In modern global routers, the negotiation technique proposed in PathFinder [22] is widely combined with rip-up and rerouting (NRR) to improve the ability to remove overflow.

Fig. 1.5. Four-pin net decomposition by (a) RSMT; (b) RMST; (c) RMST with a joint wire, in which subnets n₁ and n₂ share a joint wire.

The main idea of NRR is to increase, in the current iteration, the penalty of each grid edge that overflowed in the previous iteration, so that path searching tends to avoid previously overflowed grid edges. [22] formulates the negotiation-based routing cost $c_e$ of a grid edge $e$ as follows,

$$c_e = (b_e + h_e) \times p_e, \qquad (1.1)$$

where $b_e$ is the base cost, $h_e$ is the history cost, and $p_e$ denotes the congestion penalty. The history cost $h_e$ increases as overflow occurs. The value of $h_e$ in the (k+1)-th iteration is given by Eq. (1.2) [equation not recoverable from the source]. Another formula preserves the base cost as follows.

$$c_e = b_e + h_e \times p_e. \qquad (1.3)$$

Several variations of negotiation-based cost functions have been discussed in [10-12, 16-17].
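The negotiation idea can be sketched in a few lines. The code below is only an illustration under stated assumptions: the history cost grows by a fixed step on every edge that is currently overflowed, the congestion penalty is a hypothetical function of demand and capacity, and the edge cost takes the base-cost-preserving form of base plus history times penalty. None of these choices is claimed to match a specific router.

```python
def update_history(history, demand, capacity, step=1):
    """Return new history costs: add `step` on every overflowed edge (assumed rule)."""
    return {
        e: h + (step if demand[e] > capacity[e] else 0)
        for e, h in history.items()
    }


def congestion_penalty(demand, capacity):
    """Hypothetical penalty: 1 plus the overflow after adding one more wire."""
    return 1 + max(demand + 1 - capacity, 0)


def edge_cost(base, history, penalty):
    """Negotiation-based cost in the base-cost-preserving form b + h * p."""
    return base + history * penalty


# Illustrative state after one routing iteration: edge "a" is overflowed.
demand = {"a": 3, "b": 1}
capacity = {"a": 2, "b": 2}
history = update_history({"a": 0, "b": 0}, demand, capacity)
costs = {
    e: edge_cost(1, history[e], congestion_penalty(demand[e], capacity[e]))
    for e in demand
}
print(history, costs)
```

Because the history term persists across iterations, an edge that overflows repeatedly becomes steadily more expensive, which steers rerouted nets away from it.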

1.2.5 Layer Assignment

The goal of layer assignment in global routing is to transform a 2D global routing result into a 3D result while minimizing the number of vias, without changing the routing topology or increasing the overflow; this is called the congestion-constrained layer assignment problem. The congestion-constrained layer assignment problem for via minimization has been proven NP-complete [34] and has been studied extensively. BoxRouter2.0 [9] adopted integer linear programming to minimize the via count. FGR [3] greedily assigned net edges to metal layers by heuristics. Lee et al. proposed an efficient sequential layer assignment algorithm called COLA [27], which first determines the net assignment order and then assigns each net to appropriate layers by dynamic programming. FastRoute 4.0 [16] decomposes multi-pin nets into two-pin nets and then uses dynamic programming to assign the two-pin nets one by one. Dai et al. [17] presented a congestion-relaxed layer assignment with a layer-shifting algorithm, followed by net rip-up and reassignment to further reduce the number of vias. In addition, some researchers have extended the layer assignment problem to consider via overflow [28-29], double patterning [30], timing [31], and the antenna effect [32-33].
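To make the dynamic-programming idea concrete, the sketch below assigns layers to the edges of a single two-pin path. It is a deliberately simplified illustration, not the algorithm of COLA or FastRoute 4.0: the path is a plain chain of 2D edges, each edge comes with the list of layers that still have free capacity (assumed non-empty), pin layers are ignored, and the via count between consecutive edges is taken to be their layer distance.

```python
def assign_layers(free_layers):
    """Pick one layer per path edge to minimize vias; returns (vias, layers).

    `free_layers[i]` lists the layers on which the i-th 2D edge of the path
    still has free capacity, so the assignment adds no overflow.
    """
    # cost[l]: fewest vias for the path prefix whose last edge is on layer l
    cost = {layer: 0 for layer in free_layers[0]}
    back = []  # back[i][l]: best layer of edge i when edge i+1 is on layer l
    for layers in free_layers[1:]:
        new_cost, choice = {}, {}
        for layer in layers:
            prev = min(cost, key=lambda p: cost[p] + abs(p - layer))
            new_cost[layer] = cost[prev] + abs(prev - layer)
            choice[layer] = prev
        cost = new_cost
        back.append(choice)
    last = min(cost, key=cost.get)
    total = cost[last]
    assignment = [last]
    for choice in reversed(back):  # trace the optimal choices backwards
        assignment.append(choice[assignment[-1]])
    assignment.reverse()
    return total, assignment


# Illustrative path H, H, V, V, H with horizontal layers {1, 3} and vertical
# layers {2, 4}; some layers are already full on some edges.
vias, layers = assign_layers([[1, 3], [3], [2, 4], [4], [1, 3]])
print(vias, layers)
```

Restricting each edge to layers with free capacity is what keeps the overflow unchanged; the dynamic program then only has to minimize vias.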

1.2.6 Comparison of Recent Global Routers

Table 1.1 lists the well-known global routers developed in the past six years. Although most global routers in Table 1.1 follow the global routing flow shown in Fig. 1.3(c), they differ on several issues. Table 1.2 shows the issues widely discussed in recent global routing research. For instance, the routers in [3, 7, 71] use an RMST as the initial tree topology of each net, while the routers in [16, 71] use an RSMT; NTHU-Route [11] reroutes the nets in uncongested regions first, while the routers in [3, 12, 17] reroute the nets in congested regions first; Box-Router [9, 70] rips up a set of nets and then reroutes them one by one, while the routers in [4, 7, 11, 17] rip up a net and reroute it immediately. On parallel routing, GRIP [4, 5] parallelizes global routing in a cluster computing environment, NCTU-GR [18, 71] runs on a many-core server, and the router in [19] runs on a GPU-CPU hybrid platform.

TABLE 1.1 RECENT GLOBAL ROUTING RESEARCH

Router              References
NTHU-Route          [69, 11]
FastRoute           [13-16]
FGR                 [3, 7]
MGR                 [6]
NTUgr               [12]
Box-Router 2.0      [70, 9]
NCTU-GR             [17, 18, 71]
Archer              [10]
GRIP                [4, 5]
HybridGR            [19]
Maize-Router        [8]

TABLE 1.2 THE ISSUES DISCUSSED IN RECENT GLOBAL ROUTING RESEARCH

Issue                           References
Net decomposition               [3, 7, 11, 16, 71]
Routing algorithms              [6, 8, 10, 12, 14, 16, 71]
Routing nets ordering           [3, 11, 12, 17]
Layer assignment approaches     [3, 9, 16, 17]
Rip-up and rerouting scheme     [4, 7, 9, 11, 17, 70]
Routing cost formulation        [3, 7, 10, 11, 12, 13, 16, 17, 69, 71]
Multi-threaded routing          [4, 5, 19, 71]

