3.3 Statistical Electro-Thermal Analyzer
3.4.2 Accuracy and Efficiency
Given various ratios of WID variation and D2D variation to the total variation and 100 grids for modeling device parameters, the results of the Monte Carlo method with 2 × 10^5 samples are used as the reference (golden) solution of the statistical electro-thermal simulation.
The Level-1 Smolyak sparse grid formula that chooses the roots of H-PCs to be the
sam-
Figure 3.18: Results of the Monte Carlo method with and without the electro-thermal effect. (a) The mean temperature profile with the electro-thermal effect considered. (b) The mean temperature profile without the electro-thermal effect. (c) The standard deviation profile of the temperature distribution with the electro-thermal effect considered. (d) The standard deviation profile of the temperature distribution without the electro-thermal effect.
between the stochastic projection based method and the stochastic collocation based method, the temperature distribution is expanded by the second order H-PCs without the cross product terms for the stochastic projection based method, since this generates a form similar to that of the Level-1 Smolyak sparse grid formula. The deterministic electro-thermal simulation is executed 53 times for both the stochastic projection based and the stochastic collocation based methods, because the stochastic projection based method generates 2 × (N_L + N_tox) + 1 H-PCs to approximate the temperature distribution, and the Level-1 Smolyak sparse grid formula uses 2 × (N_L + N_tox) + 1 sampling points. Both N_L and N_tox are 13, as shown in Table 3.3, which gives 2 × (13 + 13) + 1 = 53.
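The point count above can be checked with a minimal sketch of a level-1 sparse grid: one center point plus a pair of points at ±offset along each random dimension. The function name `level1_sparse_grid` and the offset value are assumptions made for illustration only. The offset √3 used in the usage line corresponds to the nonzero roots of the third-order probabilists' Hermite polynomial, which may differ from the nodes used in this work.

```python
def level1_sparse_grid(dim, offset):
    """Return the 2*dim + 1 points of a level-1 sparse grid in `dim` dimensions.

    The grid holds the origin plus two points at +/- offset on every axis.
    `offset` is an assumed node location in normalized (zero-mean, unit-variance)
    coordinates; it is not taken from the source text.
    """
    points = [[0.0] * dim]  # center point
    for axis in range(dim):
        for sign in (1.0, -1.0):
            point = [0.0] * dim
            point[axis] = sign * offset
            points.append(point)
    return points


# N_L = N_tox = 13 random variables each, as stated for Table 3.3.
grid = level1_sparse_grid(13 + 13, 3 ** 0.5)
print(len(grid))  # 2 * (13 + 13) + 1 = 53 deterministic simulations
```

Each returned point would drive one deterministic electro-thermal simulation, which is why the cost grows only linearly with the number of random variables.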
In Table 3.5, the first two columns are the ratios of WID variation and D2D variation to the total variation, respectively. As shown in Table 3.5, the maximum errors of the two proposed statistical expression generators are less than 3% for both the estimated mean and standard deviation profiles across the three different ratios of WID variation and D2D variation.
As shown in Table 3.5, the runtime of the stochastic collocation based method is larger than that of the stochastic projection based method because it requires more iterations to solve the deterministic electro-thermal problem, especially for those samples hitting the (µ − 3σ) value of each KL expanded random variable. The “Speedup” column indicates the speedup of each developed method over the Monte Carlo method. The speedups of the stochastic projection method and the stochastic collocation method are over 179× and 164×, respectively. The results show that the proposed methods can be orders of magnitude faster than the Monte Carlo method.
The mean and standard deviation profiles of the temperature distribution on the test chip with 60% of WID variation and 40% of D2D variation to the total variation are shown in Figure 3.19(a)–(d). Figure 3.19(a) and (b) are the mean and standard deviation profiles estimated by the stochastic projection based method, respectively. Figure 3.19(c) and (d) are the mean and standard deviation profiles estimated by the stochastic collocation based method, respectively. The error distributions of the mean and standard deviation of the temperature distribution estimated by the stochastic projection based method are shown in Figure 3.19(e) and (f), respectively.
[Figure 3.19, panels (a)–(f). Panel (e): "Error Distribution of Mean Profile for Projection Based Method". Panel (f): "Error Distribution of Standard Deviation Profile for Projection Based Method". In both histograms the x-axis is "Error Comparing with Monte Carlo Method (%)" and the y-axis is "Error Proportion (%)".]
Figure 3.19: Simulation results of the developed methods. (a) and (b) The mean and standard deviation profiles of the estimated temperature distribution obtained by the stochastic projection method, respectively. (c) and (d) The mean and standard deviation profiles of the estimated temperature distribution obtained by the stochastic collocation method, respectively. (e) and (f) The error distributions of the mean and standard deviation of the estimated temperature distribution obtained by the stochastic projection method, respectively.
Table 3.6: Accuracy and Efficiency Comparison of the Skew Normal Model and APEX for Estimating Thermal Yield Profiles. The results are compared with the Monte Carlo method with 2 × 10^5 samples.
     Variation                      Skew Normal                            APEX
WID/       D2D/       T_ref     Projection         Collocation        Projection         Collocation
(WID+D2D)  (WID+D2D)            Runtime  MaxError  Runtime  MaxError  Runtime  MaxError  Runtime  MaxError
40%        60%        88.40°C   0.013s   1.51%     0.013s   1.63%     2.80s    1.91%     2.80s    1.97%
50%        50%        88.48°C   0.013s   1.46%     0.013s   1.52%     2.80s    1.87%     2.80s    1.90%
60%        40%        88.54°C   0.013s   1.37%     0.013s   1.41%     2.80s    2.27%     2.80s    2.32%
Thermal Yield Estimation
Based on the second order H-PCs without the cross product terms for the stochastic projection based method and the Level-1 Smolyak sparse grid formula for the stochastic collocation based method, the skew-normal based statistical moment matching method and APEX [99] are implemented for estimating thermal yield profiles. To avoid the instability of the Padé approximation in APEX, the stable two-pole model [100] is implemented for finding the poles/zeros. Based on the average mean (µ̄_T) and the average standard deviation (σ̄_T) of temperature obtained by the Monte Carlo method, the reference temperature (T_ref) specified by the designer is set to µ̄_T + 2.5 σ̄_T. With various ratios of WID variation and D2D variation to the total variation, the results of the skew-normal based method and APEX for estimating thermal yield profiles are summarized in Table 3.6.
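As a rough illustration of how a Monte Carlo reference could be formed, the sketch below computes T_ref as the average mean plus 2.5 times the average standard deviation over all grids, and then takes the thermal yield at each grid as the fraction of samples with temperature at or below T_ref. The function names, the use of the population standard deviation, and the "at or below" yield definition are assumptions for illustration, not details confirmed by the text.

```python
from statistics import mean, pstdev


def reference_temperature(samples, k=2.5):
    """T_ref = (average of per-grid means) + k * (average of per-grid std devs).

    `samples` is a list of Monte Carlo runs; each run is a list with one
    temperature per grid. The population standard deviation is an assumption.
    """
    per_grid = list(zip(*samples))  # regroup: one tuple of samples per grid
    mu_bar = mean(mean(column) for column in per_grid)
    sigma_bar = mean(pstdev(column) for column in per_grid)
    return mu_bar + k * sigma_bar


def thermal_yield_profile(samples, t_ref):
    """Per-grid fraction of samples whose temperature does not exceed t_ref."""
    per_grid = list(zip(*samples))
    return [sum(t <= t_ref for t in column) / len(column) for column in per_grid]


# Tiny synthetic data set: 4 samples on 2 grids (values are made up).
runs = [[80.0, 90.0], [82.0, 92.0], [84.0, 94.0], [86.0, 96.0]]
t_ref = reference_temperature(runs)
print(thermal_yield_profile(runs, t_ref))
```

With real data the sample count would be 2 × 10^5 per the text; the tiny data set here only exercises the arithmetic.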
The “Projection” and “Collocation” columns indicate that the statistical expressions of the temperature distribution are generated by the stochastic projection method and the stochastic collocation method, respectively. The “Runtime” is the execution time to obtain the thermal yield profile, and “MaxError” is the maximum error of the estimated thermal yield profile compared with the golden solution obtained by the Monte Carlo method. As shown in Table 3.6, both of our statistical expression generators provide accurate statistical on-chip temperature expressions for the thermal yield estimation. The maximum error of the skew-normal based method is at most 1.63% for all test situations, and the maximum error of APEX is at most 2.32%. The skew-normal based method is therefore more accurate than APEX in every test case.
Furthermore, as shown in Table 3.6, the proposed skew-normal based method achieves a 215× speedup over APEX (2.80 s versus 0.013 s), for two reasons. First, APEX needs high-order statistical moments to get a tight bound from its generalized Chebyshev inequality for the PDF/CDF shifting process to achieve an accurate thermal yield profile, even though [100] only needs the first four statistical moments to get the first two dominant poles. In contrast, the skew-normal based method only needs to match the first three statistical moments to construct the model and can still accurately estimate the thermal yield profile. Second, after the first two dominant poles are computed, APEX needs to solve equations to obtain the zeros of the first two dominant poles for constructing its exponential model. In contrast, the skew-normal based method only needs to perform a constant-time table look-up to estimate the thermal yield profile after the first three statistical moments have been computed.
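The three-moment matching step can be sketched with the standard method-of-moments relations for the skew-normal distribution. This is an illustrative sketch rather than the implementation used in this work: the function names are hypothetical, the skewness clamp is an assumption, and the CDF is evaluated by simple numerical integration instead of the constant-time look-up table described above.

```python
import math


def skew_normal_from_moments(mean, std, skew):
    """Match location xi, scale omega and shape alpha to three moments."""
    # A skew-normal can only represent |skewness| below about 0.995,
    # so the input is clamped (an assumption made for this sketch).
    g = max(-0.99, min(0.99, skew))
    g23 = abs(g) ** (2.0 / 3.0)
    c23 = ((4.0 - math.pi) / 2.0) ** (2.0 / 3.0)
    delta = math.copysign(math.sqrt((math.pi / 2.0) * g23 / (g23 + c23)), g)
    alpha = delta / math.sqrt(1.0 - delta * delta)
    omega = std / math.sqrt(1.0 - 2.0 * delta * delta / math.pi)
    xi = mean - omega * delta * math.sqrt(2.0 / math.pi)
    return xi, omega, alpha


def skew_normal_cdf(x, xi, omega, alpha, steps=2000):
    """Trapezoidal integration of the skew-normal PDF up to x."""
    lower = xi - 8.0 * omega  # the PDF is negligible below this point
    if x <= lower:
        return 0.0
    x = min(x, xi + 8.0 * omega)

    def pdf(t):
        z = (t - xi) / omega
        phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        big_phi = 0.5 * (1.0 + math.erf(alpha * z / math.sqrt(2.0)))
        return 2.0 / omega * phi * big_phi

    h = (x - lower) / steps
    total = 0.5 * (pdf(lower) + pdf(x))
    total += sum(pdf(lower + i * h) for i in range(1, steps))
    return min(1.0, total * h)


# Thermal yield at one grid, taken here as P(T <= T_ref); numbers are made up.
xi, omega, alpha = skew_normal_from_moments(mean=80.0, std=3.0, skew=0.5)
print(skew_normal_cdf(87.5, xi, omega, alpha))
```

Because the result is a genuine CDF, the estimated yield stays within 0 and 1 by construction, which is the property the comparison with APEX relies on.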
With (WID/(WID+D2D), D2D/(WID+D2D)) = (60%, 40%), the results of the thermal yield profile estimation are shown in Figure 3.20. The thermal yield profile obtained by the Monte Carlo method is drawn in Figure 3.20(a). Figure 3.20(b) and (c) show the estimated thermal yield profiles of the proposed skew-normal based method and APEX, respectively. The error distributions of the estimated thermal yield profiles of the proposed skew-normal based method and APEX, compared with the Monte Carlo method, are shown in Figure 3.21(a) and (b), respectively. From Figure 3.20(a)–(b) and Figure 3.21(a), it can be observed that the developed skew-normal based method accurately delivers the on-chip thermal yield profile. However, Figure 3.20(c) reveals that the thermal yield profile estimated by APEX exceeds 100% in some regions, since APEX does not guarantee that the generated statistical model preserves the properties of a CDF.
To further demonstrate that the skew-normal model based method can accurately estimate the temperature CDF at a position on the chip, Figure 3.22 plots the CDF curve of temperature at position A in Figure 3.20(a) obtained by the Monte Carlo method, together with the estimated CDF curves obtained by the skew-normal model based method and by APEX with the 9th order and the 4th order for the PDF/CDF shifting process.
As shown in Figure 3.22, the estimated CDF curve obtained by the skew-normal model based method tightly fits the CDF curve obtained by the Monte Carlo method. However, APEX with the 4th order cannot match the result obtained by the Monte Carlo method. Although the accuracy of the CDF curve estimated by APEX can be improved by increasing the order to 9, it still cannot accurately estimate the thermal yield for a smaller reference temperature value, as illustrated in Figure 3.22.
[Figure 3.20, panels (a)–(c).]
Figure 3.20: Thermal yield profiles of the test chip with (WID/(WID+D2D), D2D/(WID+D2D)) = (60%, 40%).
[Figure 3.21, panels (a) and (b). Panel (a): "Error Distribution of Thermal Yield Profile for Skew Normal Model". Panel (b): "Error Distribution of Thermal Yield Profile for APEX". In both histograms the x-axis is "Error Comparing with Monte Carlo Method (%)" and the y-axis is "Error Proportion (%)".]
Figure 3.21: The error distributions of the skew-normal based method and APEX. (a) Distribution of the skew-normal based method compared with the Monte Carlo method. (b) Distribution of APEX compared with the Monte Carlo method.
[Figure 3.22 plot: "Estimated CDFs of the Temperature at Point A in Figure 3.20(a)"; the temperature axis spans roughly 70 to 110.]
Figure 3.22: The temperature CDF curve at position A in Figure 3.20(a) obtained by the Monte Carlo method, and its estimated CDF curves obtained by the skew-normal model based method and by APEX with the 4th order and the 9th order for the PDF/CDF shifting process.
Mixed-Mesh Thermal Yield Estimation
The mixed-grid thermal yield estimation strategy presented in section 3.3.5 has been implemented in the statistical thermal expression generators to demonstrate its effectiveness. The estimated thermal yield profile of the test chip with the stochastic projection based statistical expression generator is shown in Figure 3.23. In this test case, the difference between the maximum and minimum mean temperatures, ∆T_max, is 11.1°C with 128 × 128 fine grids, and the temperature resolution, T_res, is set to 0.65°C. Hence, the number of coarse grids for the remaining N_PC − 1 deterministic thermal simulations is 16 × 16. Compared with the result from the Monte Carlo method, the maximum error of the estimated thermal yield profile obtained by the mixed-grid strategy is only 2.24%, which is slightly larger than the result shown in Table 3.6. However, the runtime of building the statistical expression of the on-chip temperature distribution is reduced to 0.019 seconds (the runtime without the mixed-grid strategy is 2.47 seconds, as shown in Table 3.5). Thus, the mixed-mesh strategy achieves a 130× speedup over the baseline statistical polynomial expression generator. The runtime for estimating the thermal yield profile is still 0.013 seconds. In total, the runtime for executing the entire flow of the mixed-mesh thermal yield estimation is 0.032 seconds.
[Figure 3.23, panels (a) and (b). Panel (b): "Error Distribution of Thermal Yield Profile for Mixed-Mesh Approximation"; x-axis "Error Comparing with Monte Carlo Method (%)", y-axis "Error Proportion (%)".]
Figure 3.23: The estimated thermal yield profile and the error distribution of the mixed-grid thermal estimation strategy.
Chapter 4
Simulation Method III – LUTSim: A Look-Up Table Based Thermal Simulator for 3-D ICs
In this chapter, a look-up table based thermal simulator, LUTSim, is presented to efficiently estimate the temperature profile of three-dimensional integrated circuits. Using pre-built tables of the temperature response induced by a unit power source, superposition, interpolation, and a recursive table look-up technique are applied to estimate the temperature profile of three-dimensional integrated circuits.
[Figure 4.1 block diagram. Inputs: Design Netlist; Design Constraints; Tier Information; Tech. Library (With TSVs); Temperature Tables Library. Blocks inside the "Early Stage Optimization Kernel of Physical Design for 3-D ICs": Look-Up Table Based Thermal Estimation; Thermal-Aware Floorplanner of 3-D ICs; TSVs/TTSVs Considered Thermal-Aware Placer of 3-D ICs.]
Figure 4.1: Key points of LUTSim for the early physical design stages in 3-D ICs.
As shown in Figure 4.1, LUTSim is conceptually similar to circuit performance analysis, such as timing and power analysis, using a standard cell library. For thermal simulation (circuit performance analysis), the thermal (electrical) characteristics of modeling grids (gates) are first pre-characterized by detailed thermal simulation (SPICE simulation), and the temperature profiles (delays/powers of gates) are tabulated in the library files. With the pre-characterized tables of temperature (electrical characteristics such as delays or powers), the thermal analysis (circuit performance analysis such as static timing analysis) can be performed efficiently via table look-up instead of executing the time-consuming detailed thermal simulation (SPICE simulation).
With the look-up table (LUT) framework, LUTSim can efficiently calculate the full-chip temperature profile without solving the large-scale modified nodal analysis (MNA) system of the equivalent thermal circuit, which is the major computational effort of the prior art [50–59]. More importantly, beyond the advantage for full-chip thermal simulation, if TSVs are moved by the optimization engines, this simulation method can update the on-chip temperature by table look-up without re-processing the large-scale thermal conductance matrix.
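The idea of replacing a matrix solve by table look-up plus superposition can be illustrated with a deliberately simplified one-dimensional sketch. It assumes a linear thermal system and a unit-power response that depends only on the distance between source grid and observation grid; real chips violate this near boundaries and TSVs, which is presumably where the interpolation and recursive look-up mentioned above come in. The names `superpose` and `unit_response` are hypothetical and do not describe LUTSim's actual data structures.

```python
def superpose(power, unit_response):
    """Estimate the temperature rise at every grid by superposition.

    power[j]         : power dissipated in grid j (W)
    unit_response[d] : pre-characterized temperature rise (degC) at distance d
                       from a 1 W source (assumed shift-invariant table)
    """
    n = len(power)
    temps = [0.0] * n
    for src, p in enumerate(power):
        if p == 0.0:
            continue  # grids without power contribute nothing
        for obs in range(n):
            # Scale the tabulated unit response by the actual source power.
            temps[obs] += p * unit_response[abs(obs - src)]
    return temps


# Two sources on a 4-grid strip; the table values are made up.
print(superpose([0.0, 2.0, 0.0, 1.0], [1.0, 0.5, 0.25, 0.125]))
```

The cost is one table read per source and observation pair, and no linear system has to be refactored when the power map changes.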
The organization of this chapter is as follows. The compact thermal model for early design stages of 3-D ICs is stated in section 4.1. Then, LUTSim is described in section 4.2. Finally, the experimental results are given in section 4.3.
4.1 Thermal Model for Early Design Stages of TSVs based 3-D IC Structures
As mentioned in section 1.3, TSV based 3-D IC structures can provide high interconnect density and are the most popular implementation category. Therefore, this work focuses on the thermal analysis of these 3-D IC structures. As exhibited in Figure 4.2, the thermal model for the early physical design stages of TSV based 3-D ICs consists of the following portions.¹
1. The primary heat flow path consists of the heat spreader, heat sink and package. The secondary heat flow path consists of the input/output pads, the package substrate and the printed circuit board. Using the techniques stated in section 2.1, the primary and secondary heat flow paths can be lumped into two different effective heat transfer coefficients, h_p and h_s, respectively.
2. Interconnect layers consist of the interconnects and the dielectric. Because the routing information is unknown in the early physical design stages, each interconnect layer can be modeled as a homogeneous layer with an effective thermal resistance or conductivity using the modeling techniques in [56] with the empirical density and the regular structure assumption of wires.
3. Functional blocks of tiers are modeled as power sources attached to the thin layers that are close to the top surfaces of the silicon bulk and the stacked silicon substrates, and there are TSVs in each silicon substrate of each stacked tier, e.g. tiers 2 and 3 in Figure 4.2.
¹ Although LUTSim adopts the thermal model of TSV based 3-D ICs, its framework can be extended to other structures of 3-D ICs, e.g. face-to-face, contactless interconnection and wire-bonded structures [3].
[Figure 4.2 cross-section. Labels: Tier 1, Tier 2, Tier 3; Silicon Bulk; Silicon Substrate; Interconnect Layer; TSV; Device; Primary Heat Flow Path (h_p) and Secondary Heat Flow Path (h_s), each to Ambient Air. Chip corners at (x, y) = (0, 0), (L_x, 0), (L_x, L_y) and (0, L_y); vertical extent from z = 0 to z = L_z.]
Figure 4.2: Thermal model for the early design stage of a 3-D IC with three tiers.
As stated previously, this work focuses on the steady-state thermal analysis because the
above thermal model, the temperature profile of a TSV based 3-D IC, T(r), is governed by the following steady-state heat transfer equation,
\nabla \cdot \left( \kappa(r) \nabla T(r) \right) = -p(r),    (4.1)
subject to the boundary condition
\kappa(r_{bs}) \frac{\partial T(r_{bs})}{\partial \vec{n}_{bs}} + h_{bs} T(r_{bs}) = f_{bs}(r_{bs}).    (4.2)
Here, r = (x, y, z) ∈ D, D = (0, L_x) × (0, L_y) × (0, L_z) is the domain of the chip, L_x and L_y are the lateral sizes of the chip, L_z is the thickness of the chip, κ(r) is the thermal conductivity (W/(m·°C)) of the chip, and ∇ is the del operator, so that ∇· denotes the divergence. The subscript bs denotes any specific boundary surface of the chip, r_bs is the position on bs, h_bs is the heat-transfer coefficient on bs, f_bs(r_bs) is the heat flux function on bs, n_bs is the outward normal to bs, ∂/∂n_bs denotes the differentiation along the outward normal to bs, and p(r) is the power density profile of the chip. Since the major portion of the device current flows through the channel, p(r) is non-zero only when r is in the thin layers close to the top surfaces of tiers. The thicknesses of these thin layers are equal to the junction depth of the devices.
Although the steady-state heat transfer equations (4.1) and (4.2) are similar to the steady-state heat transfer equations for 2-D ICs stated in section 2.1, the difference is that the thermal conductivity profile κ(r) in the silicon substrates and bulk cannot be treated as a constant because there are TSVs in these layers. Therefore, the analytical simulation framework stated in Chapter 2 would have to be modified for TSV based 3-D ICs; this modification is left as an open research topic. Instead of the analytical simulation approaches, using the finite difference method (FDM), the steady-state heat transfer equations (4.1) and (4.2) can be transformed into a SPICE-compatible equivalent thermal circuit [50], and the steady-state temperature profile of a 3-D IC can be obtained by solving the following modified nodal analysis (MNA) system.
G T = p.    (4.3)
Here, G is the thermal conductance matrix, T is the temperature profile vector of the simulation grids, and p is the power vector of the modeling grids. The entries of p are non-zero only for grids in S_g, where S_g is the set of grids close to the top surfaces of tiers. With equation (4.3), our target is to estimate the temperature profile of the grids in S_g because the temperatures of functional blocks are of primary concern in the early physical design stages.
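To make the structure of equation (4.3) concrete, the sketch below assembles G for a toy one-dimensional chain of thermal nodes and solves G T = p by Gaussian elimination. The chain topology, the conductance values, and the function names are assumptions chosen for illustration; an actual FDM discretization of a 3-D IC yields a far larger sparse system that calls for the dedicated solvers cited in the text.

```python
def build_conductance_matrix(n, g, g_amb):
    """Assemble G for n nodes in a chain.

    g     : conductance between neighboring nodes (W/degC)
    g_amb : conductance from node 0 to ambient, standing in for the
            heat-transfer boundary condition (assumed, lumped)
    """
    G = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        # Stamp the conductance between node i and node i + 1.
        G[i][i] += g
        G[i + 1][i + 1] += g
        G[i][i + 1] -= g
        G[i + 1][i] -= g
    G[0][0] += g_amb
    return G


def solve(G, p):
    """Solve G T = p by Gaussian elimination with partial pivoting."""
    n = len(p)
    A = [row[:] + [p[i]] for i, row in enumerate(G)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= factor * A[col][c]
    T = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(A[i][j] * T[j] for j in range(i + 1, n))
        T[i] = (A[i][n] - tail) / A[i][i]
    return T


# 1 W injected at the node farthest from ambient; T is the rise over ambient.
G = build_conductance_matrix(n=4, g=2.0, g_amb=1.0)
print(solve(G, [0.0, 0.0, 0.0, 1.0]))
```

Changing any conductance, for example because a TSV moved, alters G and forces the elimination to be repeated, which is the overhead that a table look-up scheme aims to avoid.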
Once the thermal conductance matrix of the TSV based 3-D IC is constructed, an advanced numerical simulation framework such as [51, 54, 55] can be adopted to solve for the temperature profile of the simulation grids, T. However, since the positions of TSVs will be moved by early stage design engines such as floorplanners or placers, the thermal conductance matrix, G, will be different after each optimization step is executed. Therefore, the handling process² of the thermal conductance matrix needs to be re-performed in each optimization loop, which decreases the efficiency of [51, 54, 55]. To avoid re-handling the thermal conductance matrix while thermal-aware design engines are executing, LUTSim employs the look-up table framework to simulate the temperature profile of 3-D ICs.
With equation (4.3), our target is to estimate the temperature profile of the grids in S_g, denoted T_Sg, because the temperatures of functional blocks are of primary concern in the early physical design stages.