Results and Analyses - iLap: An Iterative Layer-Aware Partitioning Algorithm for Reducing the Number of Through-Silicon Vias in 3D ICs

Chapter 4 Experiments

4.2 Results and Analyses

4.2.1 Analysis of TSV Count with Fixed Layer

Table 2 reports the TSV demand when the number of layers in a 3D IC is set to 4. EV-matrix performs about as well as plain hMetis. Given a set of 4 partitions generated by hMetis, EX-hMetis always picks the layer assignment with the lowest TSV count among the 4! = 24 possible permutations, and consequently attains a 16% TSV reduction on average compared with hMetis. MKLP shows its layer-aware advantage on the three largest cases, with about 70% reduction, but it is not reliable on the others. iLap-2 performs as well as EX-hMetis on average and is more efficient than EX-hMetis on the larger cases. Because it considers the overall situation and predicts the future distribution within the framework, iLap-k improves on iLap-2 by about 19% on average and reduces the TSV count by 36% and 24% compared with hMetis and EX-hMetis, respectively. Moreover, for the three largest test cases (cfft, aqua, and video), iLap-k outperforms hMetis by more than 75%. Although hMetis is an excellent multi-way min-cut partitioning algorithm, it fails to be a good 3D partitioner because it is layer-unaware. Even EX-hMetis, with its exhaustive permutations, still cannot beat iLap-k.

Compared with MKLP, the iLap framework dynamically and iteratively selects the best partial solution to construct a better result. We therefore conclude that a dedicated layer-aware 3D partitioning algorithm such as iLap should be regarded as an essential component of a sophisticated 3D IC design environment.
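The exhaustive layer-assignment step attributed to EX-hMetis above can be illustrated with a minimal sketch. It assumes a common TSV model that this section does not spell out: a net whose cells span layers lo through hi needs (hi - lo) TSVs. The function names, the toy netlist, and the part labels are hypothetical and are not taken from the experiments.

```python
from itertools import permutations

def tsv_count(hyperedges, part_of, layer_of_part):
    """Total TSVs, ASSUMING a net spanning layers lo..hi needs (hi - lo) TSVs."""
    total = 0
    for net in hyperedges:
        layers = [layer_of_part[part_of[cell]] for cell in net]
        total += max(layers) - min(layers)
    return total

def best_layer_assignment(hyperedges, part_of, k):
    """Score all k! part-to-layer permutations and keep the first cheapest one."""
    best = None
    for perm in permutations(range(k)):
        cost = tsv_count(hyperedges, part_of, perm)
        if best is None or cost < best[0]:
            best = (cost, perm)
    return best

# Hypothetical toy instance: four cells, one per part, chained a-c-b-d.
part_of = {"a": 0, "b": 1, "c": 2, "d": 3}
hyperedges = [("a", "c"), ("c", "b"), ("b", "d")]

identity_cost = tsv_count(hyperedges, part_of, (0, 1, 2, 3))
best_cost, best_perm = best_layer_assignment(hyperedges, part_of, 4)
print(identity_cost, best_cost, best_perm)
```

In this toy case the partition itself is unchanged; only the mapping of parts to layers differs, which is the degree of freedom a layer-unaware partitioner leaves unused.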

Table 2. Total number of TSVs when k = 4.

When the number of layers is increased to 8, as Table 3 shows, EV-matrix cannot always improve on the result of hMetis, since it only weakly addresses the hyperedge problem. Among the hMetis-based algorithms, EX-hMetis is again the best. MKLP cannot find a good solution when small circuits are partitioned into a large number of layers, although it does well on the three largest cases. Compared with the other methods, iLap-2 and iLap-k still require the fewest TSVs.

Table 3. Total number of TSVs when k = 8.

4.2.2 TSV Count

Figure 24 depicts the average TSV count over the 14 test cases as a function of the number of layers. Three points are worth noting. First, the more layers a design is partitioned into, the more TSVs it generally requires. Second, iLap-k and iLap-2 are the consistent winners among all methods from 2 to 10 layers. Third, unlike the layer-unaware methods, the number of TSVs required by the layer-aware algorithms rises very smoothly as the number of layers increases.

Taking hMetis as the baseline, Figure 25 shows the average TSV ratios as a function of the number of layers. Again, three points are worth noting. First, iLap-k consistently outperforms hMetis by about 33% in TSV count regardless of the number of layers. Second, the curves show that iLap-2 is nearly equal to EX-hMetis. Finally, EX-hMetis always outperforms hMetis, as expected.

Figure 24. The number of required TSVs in 3D ICs.

Figure 25. Normalized TSV count.

4.2.3 Distribution of TSV Count

Figure 26 presents the average standard deviation of the TSV count as a function of the number of layers. The standard deviation associated with iLap-k and iLap-2 is evidently more stable than that of the other methods. As previously mentioned, a TSV occupies significant silicon real estate, so a high standard deviation of TSV count potentially worsens the area imbalance among individual layers and can even lower the yield of a design.

Figure 27 reports the average maximum TSV count at any single junction of a design as a function of the number of layers; iLap-k and iLap-2 always have the lowest values regardless of the number of layers. For some 3D logic structures, such as 3D FPGAs, the number of pre-fabricated inter-layer TSVs is fixed. The design mapping is therefore considered a failure if the required TSVs exceed the provided ones at even one junction, and a high maximum TSV count increases the chance of such a failure.
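The two distribution metrics discussed above can be sketched as follows. This is an illustration under stated assumptions, not the thesis's measurement code: a net is assumed to place one TSV at every junction between its lowest and highest layer, the standard deviation is assumed to be taken across junctions using the population form, and the function name and toy netlist are hypothetical.

```python
from statistics import pstdev

def tsvs_per_junction(hyperedges, layer_of, k):
    """Count TSVs at each of the k-1 junctions between adjacent layers."""
    counts = [0] * (k - 1)
    for net in hyperedges:
        layers = [layer_of[cell] for cell in net]
        # ASSUMPTION: the net crosses every junction between its extreme layers.
        for j in range(min(layers), max(layers)):
            counts[j] += 1  # junction j sits between layer j and layer j + 1
    return counts

# Hypothetical toy design on 4 layers.
layer_of = {"a": 0, "b": 1, "c": 1, "d": 3}
hyperedges = [("a", "b"), ("a", "c", "d"), ("b", "c"), ("a", "c")]

counts = tsvs_per_junction(hyperedges, layer_of, 4)
max_tsv = max(counts)        # the quantity a fixed-TSV fabric must accommodate
spread = pstdev(counts)      # population standard deviation across junctions
print(counts, max_tsv, round(spread, 4))
```

Under a fixed per-junction TSV budget, as in the 3D FPGA case mentioned above, the mapping would be checked against `max_tsv` rather than against the total.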

Figure 26. Standard deviation of TSV count.

Figure 27. Maximum TSV

4.2.4 Runtimes

Regarding runtime efficiency, Figure 28 gives the average runtime over the 14 test cases, in seconds, as a function of the number of layers. Both hMetis and EV-matrix are evidently very time-efficient. The runtimes of iLap-k and iLap-2 grow linearly as the number of layers increases. This trend is natural, since the number of invocations of multi-way partitioning inside the iLap-based algorithms also grows linearly with the number of layers.

Since the iterative engine in iLap-2 is a simple 2-way partitioning, it is faster than MKLP as the number of layers increases. Even though the engine in iLap-k is more complex, it needs only a few seconds to complete. Hence, given its excellent performance in TSV minimization, the runtime of iLap-k should be acceptable. As for EX-hMetis, since it has to check all possible permutations to find the best one, its runtime grows factorially with the number of layers (k! permutations for k layers). It is no wonder that its runtime increases drastically once the number of layers exceeds 8. Even if techniques were applied to speed up EX-hMetis, its solution quality would not improve.
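To make the growth of the exhaustive search concrete, the short sketch below tabulates the number of layer permutations for 2 to 10 layers. It counts candidate assignments only and says nothing about measured runtimes; the function name is hypothetical.

```python
from math import factorial

def layer_assignments(k):
    """Number of part-to-layer permutations an exhaustive search must score."""
    return factorial(k)

growth = {k: layer_assignments(k) for k in range(2, 11)}
for k, n in growth.items():
    print(f"k = {k:2d}: {n:>9,d} permutations")
```

The count is 24 at 4 layers, 40,320 at 8 layers, and 3,628,800 at 10 layers, which is consistent with the sharp runtime increase described above once the number of layers exceeds 8.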

Figure 28. Runtimes of the experiments.
