Chapter 1 Introduction
1.2 Related Works
In general, the way a design is partitioned into the vertical layers of a 3D logic structure determines how many TSVs are required for signal connections among those layers. Over the past few years, several works have tackled the problem of 3D partitioning for TSV minimization. One solution models the problem as an integer linear programming (ILP) problem [18]; however, its runtime grows exponentially with problem size. The works in [19] and [20] each develop a modified FM-based [21] partitioning method that obtains the layer assignment for all layers at once rather than layer by layer. A brief introduction to the layer-aware algorithm of [19], presented at ISQED 2010, is given in Section 1.2.2. Meanwhile, the authors of the 3D FPGA synthesis frameworks TPR [22]–[24] and MEANDER [25][26] use a two-step approach to 3D design partitioning: they first apply the well-known partitioning algorithm hMetis [27][28] to divide a design into layer-unaware partitions, and then assign each partition to its target layer. MEANDER assigns each partition to a layer randomly, while TPR uses an EV-matrix [22]. Section 1.2.1 details how TPR minimizes the number of TSVs in a few simple steps.
These related works show the importance of the partitioning process. Because partitioning is the first step in translating a 2D netlist into a 3D structure, its result directly and strongly influences the number of TSVs. In our work, we use a layer-aware algorithm to iteratively minimize the number of TSVs.
1.2.1 EV-Matrix
Here we introduce the linear-placement method that TPR [23] uses for layer assignment after min-cut partitioning. It uses an EV-matrix [22] to model the 3D structure. After a min-cut partitioning such as the one in Figure 5, the graph is mapped into an EV-matrix, as shown in Figure 6(a). The EV-matrix is an m × n matrix, where m (the number of rows) is the number of edges (nets) in the graph and n (the number of columns) is the number of parts. An element a(i, j) is 1 if the j-th part is a terminal of the i-th net and 0 otherwise. The bandwidth of a matrix is defined as the maximum distance between the first and last nonzero entries among all rows. In other words, the bandwidth is associated with the number of cuts of each edge, as shown in Figure 6(b). There are two goals:
i) To minimize the total number of cuts: make the bandwidth of each row as small as possible.
ii) To minimize the maximum cut size: move all of the 1's as close to the main diagonal as possible.
Figure 5. A partitioned graph.
Figure 6. EV-matrix: (a) the EV-matrix of the partitioned graph; (b) the cuts of each edge. Initial: total cut size = 11, max-cut = 3.
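The definitions above can be illustrated with a minimal Python sketch. It assumes that each net is given as a set of part indices and that the column order is the layer order; the function names and the four-part example are hypothetical and are not taken from Figure 5 or from TPR. The "maximum cut size" is interpreted here, as an assumption, as the largest number of nets crossing any boundary between two adjacent columns.

```python
def build_ev_matrix(nets, num_parts):
    """Rows are nets (edges), columns are parts.
    Entry (i, j) is 1 if part j is a terminal of net i, else 0."""
    return [[1 if p in net else 0 for p in range(num_parts)] for net in nets]


def row_span(row):
    """Distance between the first and last nonzero entries of one row."""
    cols = [j for j, v in enumerate(row) if v]
    return cols[-1] - cols[0] if cols else 0


def bandwidth(matrix):
    """Maximum row span among all rows, as defined in the text."""
    return max((row_span(r) for r in matrix), default=0)


def total_cuts(matrix):
    """Sum of row spans: each net is cut once per boundary it crosses."""
    return sum(row_span(r) for r in matrix)


def max_cut(matrix):
    """Assumed meaning of 'maximum cut size': the largest number of nets
    crossing any single boundary between adjacent columns."""
    n = len(matrix[0]) if matrix else 0
    crossing = [0] * max(n - 1, 0)
    for row in matrix:
        cols = [j for j, v in enumerate(row) if v]
        if cols:
            for b in range(cols[0], cols[-1]):
                crossing[b] += 1
    return max(crossing, default=0)


# Hypothetical example: 4 parts, 4 two-terminal nets.
example_nets = [{0, 2}, {1, 2}, {0, 3}, {2, 3}]
ev = build_ev_matrix(example_nets, 4)
print(total_cuts(ev), bandwidth(ev), max_cut(ev))  # prints: 7 3 3
```

In this reading, goal i) corresponds to lowering `total_cuts` and goal ii) to lowering `max_cut`.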
To reach these objectives, the method only needs to move columns and rows. Although it is efficient in runtime, it still has some disadvantages. First, it cannot provide a good solution, especially for hypergraphs. Second, although hMetis is an efficient and effective min-cut multi-way partitioning tool, it lacks a notion of layers. That is, a typical 2D partitioning algorithm gives a similar weight to a cut between any two partitions, whereas in 3D partitioning that weight can vary widely and depends heavily on whether the two partitions (i.e., layers) are close to or far away from each other. Hence, layer-unaware algorithms usually become trapped in local minima.
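The difference between a layer-unaware and a layer-aware cut cost can be made concrete with a short Python sketch. It assumes that a net spanning layers lo through hi needs hi − lo TSVs, and it searches all column orders by brute force purely for illustration; this exhaustive search is not the heuristic used in TPR, and all names and the example data are hypothetical.

```python
from itertools import permutations


def cut_net_count(nets):
    """Layer-unaware cost: every net that touches more than one part
    counts once, regardless of how far apart those parts are."""
    return sum(1 for net in nets if len(set(net)) > 1)


def tsv_count(nets, layer_of):
    """Layer-aware cost (assumption): a net spanning layers lo..hi
    needs hi - lo TSVs."""
    total = 0
    for net in nets:
        layers = [layer_of[p] for p in net]
        total += max(layers) - min(layers)
    return total


def best_layer_order(nets, num_parts):
    """Try every assignment of parts to layers (column order) and keep
    the one with the fewest TSVs. Exponential; illustration only."""
    best = None
    for order in permutations(range(num_parts)):
        layer_of = {part: layer for layer, part in enumerate(order)}
        cost = tsv_count(nets, layer_of)
        if best is None or cost < best[0]:
            best = (cost, order)
    return best


# Hypothetical example: 4 parts, 4 two-terminal nets.
example_nets = [(0, 2), (1, 2), (0, 3), (2, 3)]
identity = {p: p for p in range(4)}
print(cut_net_count(example_nets))        # prints: 4
print(tsv_count(example_nets, identity))  # prints: 7
print(best_layer_order(example_nets, 4)[0])  # prints: 5
```

The same partition has a layer-unaware cost of 4 under any ordering, while its TSV count ranges from 7 down to 5 depending on which part is placed on which layer.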
1.2.2 Multilevel Multilayer Partitioning Algorithm for 3D ICs
This section introduces another algorithm that, unlike the layer-unaware method above, is layer-aware. The multilevel multilayer partitioning algorithm [19] modifies the multilevel k-way min-cut partitioning algorithm. It becomes layer-aware by redefining the cut calculation to fit 3D ICs.
[Flowchart. Node labels: Start; Gate-Level Netlist; Cell Library; Construct Data Structure; Coarsening; Do Coarsening Steps Stop? (Yes/No); Initial-k-Layer-Partition; k-Layer-Partition; Do Uncoarsen Steps Stop? (Yes/No); Finish.]
Figure 7. The flow of multilevel multilayer partitioning algorithm.
Figure 7 shows the overall flow of this algorithm. At the beginning, the coarsening phase clusters highly connected cells together. Next, an initial k-layer partitioning assigns every gate to a layer. After this initial partitioning phase, k-layer partitioning and uncoarsening are repeated to reduce the total number of cuts until the uncoarsening steps stop.
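The control flow described above can be sketched in Python as follows. This is only a structural sketch under simplifying assumptions, not the method of [19]: the coarsening rule (merge the first two unmatched cells of each net), the contiguous initial partition, the greedy single-move refinement, and the `capacity` balance limit are all stand-ins chosen for brevity, and every function name is hypothetical.

```python
def tsv_cost(nets, layer):
    """Layer-aware cost: a net spanning layers lo..hi needs hi - lo TSVs."""
    return sum(max(layer[c] for c in net) - min(layer[c] for c in net)
               for net in nets)


def coarsen(num_nodes, nets):
    """One coarsening step: merge the first two unmatched nodes of each net."""
    mapping, next_id = [-1] * num_nodes, 0
    for net in nets:
        free = [c for c in net if mapping[c] == -1]
        if len(free) >= 2:
            mapping[free[0]] = mapping[free[1]] = next_id
            next_id += 1
    for c in range(num_nodes):
        if mapping[c] == -1:
            mapping[c] = next_id
            next_id += 1
    coarse_nets = []
    for net in nets:
        merged = tuple(sorted({mapping[c] for c in net}))
        if len(merged) > 1:  # nets absorbed into one cluster disappear
            coarse_nets.append(merged)
    return next_id, coarse_nets, mapping


def initial_partition(weights, k):
    """Naive initial k-layer partition: fill layers in node order."""
    target = sum(weights) / k
    layer, cur, load = [], 0, 0
    for w in weights:
        if load > 0 and load + w > target and cur < k - 1:
            cur, load = cur + 1, 0
        layer.append(cur)
        load += w
    return layer


def refine(nets, layer, weights, k, capacity):
    """Greedy refinement: move a node when that strictly lowers the TSV
    count and the target layer stays within capacity."""
    load = [0] * k
    for n, l in enumerate(layer):
        load[l] += weights[n]
    improved = True
    while improved:
        improved = False
        for n in range(len(layer)):
            cur, base = layer[n], tsv_cost(nets, layer)
            for l in range(k):
                if l == cur or load[l] + weights[n] > capacity:
                    continue
                layer[n] = l
                if tsv_cost(nets, layer) < base:
                    load[cur] -= weights[n]
                    load[l] += weights[n]
                    improved = True
                    break
                layer[n] = cur  # undo a move that did not help
    return layer


def multilevel_partition(num_cells, nets, k, capacity, min_nodes=4):
    levels = []
    n, cur_nets, weights = num_cells, list(nets), [1] * num_cells
    while n > max(min_nodes, k):  # coarsening phase
        cn, cnets, mapping = coarsen(n, cur_nets)
        if cn == n:
            break
        levels.append((mapping, cur_nets, weights))
        cw = [0] * cn
        for fine, coarse in enumerate(mapping):
            cw[coarse] += weights[fine]
        n, cur_nets, weights = cn, cnets, cw
    layer = refine(cur_nets, initial_partition(weights, k), weights, k, capacity)
    while levels:  # uncoarsening phase with refinement at each level
        mapping, cur_nets, weights = levels.pop()
        layer = [layer[mapping[f]] for f in range(len(mapping))]
        layer = refine(cur_nets, layer, weights, k, capacity)
    return layer


# Hypothetical example: a chain of 8 cells split across 2 layers.
chain = [(i, i + 1) for i in range(7)]
result = multilevel_partition(8, chain, k=2, capacity=4)
print(result, tsv_cost(chain, result))
```

Because refinement here accepts only strictly improving single moves, the final quality depends heavily on the initial partition; an FM-style refinement with gain buckets would be the more usual choice in practice.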
Note that this framework performs a single multilevel iteration to decide the layer in which each gate is placed. It reduces only the total number of TSVs and does not consider the maximum number of TSVs between any pair of adjacent layers. Although this work takes the 3D structure into account, the solutions it finds still leave room for improvement.
Therefore, in this thesis, we propose an iterative layer-aware partitioning algorithm, named iLap, for TSV minimization in 3D ICs. Unlike [18]–[20], iLap identifies only one layer at each iteration; that is, iLap is iterative and gradually produces the final result layer by layer. Also, unlike [22]–[26], which perform layer-unaware partitioning followed by layer assignment, iLap applies layer-aware partitioning at each iteration. Although iLap also uses min-cut partitioning as the kernel of its engine, the experimental results demonstrate that iLap achieves clearly better TSV minimization than three other hMetis-based methods and the multilevel multilayer partitioning method for various numbers of layers, and its runtime is within a few seconds.
Moreover, in addition to minimizing TSVs, iLap distributes TSVs among layers more evenly than existing approaches. This feature is a significant advantage in design flows for ASICs as well as for other regular 3D logic structures (e.g., 3D FPGAs).