Chapter 5 Experimental Results
5.4 Case Study
In this section, we use the apex2 benchmark to observe the effectiveness of the four placement and routing flows from Experiment II as the number of layers varies from 1 to 8. The logic utilization is set to 75%.
Figure 45 shows the temperature profile at layer 1 of a 4-layer design. TherWare produces a noticeably smoother temperature profile than the other flows.
Figure 45. Temperature profile at layer 1 of a 4-layer design.
As shown in Figure 46, the maximum temperature increases with the number of layers in all flows. TherWare always has the lowest maximum temperature, and the reduction is about 70℃ in the 8-layer design. Similarly, as shown in Figure 47, the temperature deviation grows with the number of layers in all flows, and TherWare again has the lowest deviation.
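To make the reported metrics concrete, the following minimal sketch computes them for one layer of tile temperatures. It rests on assumptions that are not stated in this section: "temperature deviation" is taken to be the population standard deviation, "maximum temperature gradient" is taken to be the largest difference between horizontally or vertically adjacent tiles, and the function name `thermal_metrics` and the sample grid are hypothetical.

```python
import statistics


def thermal_metrics(grid):
    """Return (max temperature, deviation, max adjacent-tile gradient).

    grid is a list of rows of tile temperatures in degrees Celsius.
    """
    temps = [t for row in grid for t in row]
    max_temp = max(temps)
    # Assumption: "deviation" means the population standard deviation.
    deviation = statistics.pstdev(temps)
    # Assumption: "gradient" means the largest difference between
    # 4-neighbour tiles (right and down neighbours cover every pair once).
    max_grad = 0.0
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            if r + 1 < rows:
                max_grad = max(max_grad, abs(grid[r][c] - grid[r + 1][c]))
            if c + 1 < cols:
                max_grad = max(max_grad, abs(grid[r][c] - grid[r][c + 1]))
    return max_temp, deviation, max_grad


# Made-up temperatures for illustration only, not measured data.
layer = [
    [60.0, 62.0, 61.0],
    [63.0, 70.0, 64.0],
    [61.0, 63.0, 60.0],
]
max_temp, deviation, max_grad = thermal_metrics(layer)
print(f"max={max_temp:.1f} deviation={deviation:.2f} gradient={max_grad:.1f}")
```

A smoother profile, in this reading, corresponds to a smaller deviation and a smaller maximum gradient for the same grid size.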
Figure 48 shows the delay for 1- to 8-layer designs. The delay decreases as the number of layers increases and eventually saturates, mainly because 3D designs shorten global interconnects.
Considering the tradeoff between temperature optimization and timing impact, our TherWare framework achieves the largest improvement in thermal behavior with only a small impact on critical delay, which is especially evident in the 4-layer design.
Figure 46. Maximum temperature for 1- to 8-layer designs.
Figure 47. Temperature deviation for 1- to 8-layer designs.
Figure 48. Delay for 1- to 8-layer designs.
Chapter 6 Conclusion
In this thesis, we first develop a set of fine-grained thermal resistive models for 3D FPGAs at three granularities, named FG-8, FG-4, and FG-2.
Taking the finest-granularity model, FG-8, as the baseline, FG-4 is both accurate and efficient. For FG-4, the root mean square error is less than 2.5%, and the maximum absolute difference is less than 3.9%. FG-4 also achieves a 99.8% correlation with FG-8 and a 7.3 times speedup in runtime.
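The accuracy figures above compare the temperatures predicted by a coarser model against those of the baseline model. As a minimal sketch of how such figures could be computed, the code below evaluates a percentage root mean square error, a maximum absolute percentage difference, and a Pearson correlation. It assumes that errors are normalized by the baseline temperature of each tile, which this chapter does not state; the function name `compare_models` and the sample values are hypothetical.

```python
import math


def compare_models(reference, candidate):
    """Compare candidate temperatures against reference ones, tile by tile.

    Returns (RMS error in %, max absolute difference in %, correlation).
    """
    if len(reference) != len(candidate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(reference)
    # Assumption: each error is relative to the reference temperature.
    rel = [(c - r) / r for r, c in zip(reference, candidate)]
    rmse_pct = 100.0 * math.sqrt(sum(e * e for e in rel) / n)
    max_abs_pct = 100.0 * max(abs(e) for e in rel)
    # Pearson correlation coefficient between the two temperature lists.
    mean_r = sum(reference) / n
    mean_c = sum(candidate) / n
    cov = sum((r - mean_r) * (c - mean_c) for r, c in zip(reference, candidate))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_c = sum((c - mean_c) ** 2 for c in candidate)
    corr = cov / math.sqrt(var_r * var_c)
    return rmse_pct, max_abs_pct, corr


# Made-up temperatures for illustration only, not results from the thesis.
fine_temps = [60.0, 65.0, 72.0, 68.0]    # stands in for the baseline model
coarse_temps = [60.6, 64.2, 72.9, 67.5]  # stands in for a coarser model
rmse_pct, max_abs_pct, corr = compare_models(fine_temps, coarse_temps)
print(f"RMSE={rmse_pct:.2f}% max diff={max_abs_pct:.2f}% corr={corr:.4f}")
```

The speedup would be obtained separately, as the ratio of the two models' measured runtimes on the same designs.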
We also propose a thermal-aware backend framework for 3D FPGAs, named TherWare. Three guidelines are integrated into the TherWare placement stage: i) power uniformity: keeping power uniform across tiles with placed CLBs; ii) heat dissipativity: letting potentially hotter tiles dissipate heat easily; iii) interconnect power: avoiding excessive increases in interconnect power. Our router allows non-timing-critical nets to choose longer paths with lower power consumption, which helps distribute power more uniformly in 3D FPGA designs.
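The routing idea can be illustrated with a small sketch. This is not the TherWare cost function, which is not given in this chapter; it is a generic criticality-weighted blend of delay and power, searched with Dijkstra's algorithm. The function name `route`, the toy graph, and the normalized delay and power values are all hypothetical.

```python
import heapq


def route(graph, source, sink, criticality):
    """Find the cheapest path under a delay/power cost blended by criticality.

    graph maps a node to a list of (next_node, delay, power) edges.
    criticality is in [0, 1]; 1 means fully timing-critical.
    """
    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == sink:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, delay, power in graph.get(node, []):
            # Assumed cost: critical nets weight delay, others weight power.
            edge_cost = criticality * delay + (1.0 - criticality) * power
            new_cost = cost + edge_cost
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    if sink not in best:
        return None  # sink is unreachable
    path = [sink]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Toy graph: a short, power-hungry direct edge and a longer low-power detour.
graph = {
    "S": [("T", 1.0, 5.0), ("A", 1.5, 1.0)],
    "A": [("T", 1.5, 1.0)],
}
critical_path = route(graph, "S", "T", criticality=0.9)
relaxed_path = route(graph, "S", "T", criticality=0.1)
print(critical_path, relaxed_path)
```

In this toy graph the timing-critical net keeps the direct edge, while the non-critical net takes the detour through `A`, accepting more delay for less power.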
The experimental results are summarized in Table 5. TherWare outperforms all baselines on the temperature metrics while incurring only small overheads in delay and runtime. We conclude that TherWare is the most effective thermal-aware placement and routing framework for 3D FPGAs to date.
Table 5. Improvements of TherWare over different baselines.
Baseline                Maximum      Temperature  Maximum temperature  Total   Delay     Runtime
                        temperature  deviation    gradient             power   overhead  overhead
TPR                     25.8%        41.6%        24.7%                7.6%    2.0%      4.2%
3D MEANDER              16.6%        31.9%        13.3%                -7.9%   0.8%      14.0%
Z-tile-P + TherWare-R   13.1%        26.9%        11.8%                3.7%    0.5%      -0.3%