3.3 Statistical Electro-Thermal Analyzer
3.3.5 Mixed-Mesh Thermal Yield Estimation
As mentioned in Sections 3.3.1 and 3.3.2, the developed statistical expression generators must solve several deterministic heat transfer equations to obtain the statistical expressions of the on-chip temperature distribution. Our results show that accurate statistical expressions of the on-chip temperature distribution can be obtained with the Level-1 Smolyak sparse grid formula in the stochastic collocation based method, or with the second-order H-PC expression without the cross-product terms in the stochastic projection based method. Even so, $2 \times (N_L + N_{tox}) + 1$ deterministic heat transfer equations still have to be solved. Therefore, the runtime of the thermal yield estimation is dominated by the statistical expression generators. To reduce the runtime so that the thermal yield can serve as the thermal cost of thermal-aware optimization engines, such as thermal-aware floorplanners or placers, a mixed-mesh strategy is proposed to estimate the on-chip thermal yield profile under an allowable temperature resolution, $T_{res}$.
The mixed-mesh strategy is inspired by the following observations. The developed statistical polynomial expression generator stated in Section 3.3.1/3.3.2 consists of one deterministic thermal simulation for calculating the mean/nominal temperature profile and $2 \times (N_L + N_{tox})$ deterministic thermal simulations for calculating the variations of the temperature distribution.
Practically, since the process variations of parameters are usually within a controllable range, the mean/nominal value of the circuit performance is larger than its variance and skewness [6, 7, 76, 90, 92, 94]. Since the mean is the location parameter of the PDF/CDF of the temperature at a specific position on the die, it contributes the major portion of the thermal yield value.
Based on the above observations, the mixed-mesh strategy for generating the statistical polynomial expression of the temperature distribution is shown in Figure 3.16. To preserve the estimation accuracy, a fine-mesh deterministic thermal simulation is performed to obtain the mean/nominal temperature profile. Then, the difference $\Delta T_{max}$ between the maximum and minimum temperatures of the mean/nominal temperature profile is extracted, and a temperature resolution $T_{res}$ is chosen. Next, an $N_{CM}$ by $N_{CM}$ coarse mesh is used for executing the remaining $N_{PC} - 1$ or $N_H - 1$ deterministic thermal simulations. Here, $N_{CM}$ can be calculated using the criterion $\lceil \Delta T_{max} / T_{res} \rceil$. After that, using the statistical polynomial expression generated by these $N_{PC} - 1$ or $N_H - 1$ coarse-mesh temperature simulations, the coarse-mesh variance and skewness profiles of the temperature distribution are obtained. Finally, the thermal yield profile is calculated by using the mixed-mesh mean/nominal, variance and skewness profiles of the temperature distribution.
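The mesh-selection step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `coarse_mesh_size`, the list-of-lists layout of the mean profile, and the sample temperatures are hypothetical and are not taken from the analyzer implementation.

```python
import math


def coarse_mesh_size(mean_profile, t_res):
    """Return ceil(dT_max / T_res) for a 2-D mean/nominal temperature profile.

    mean_profile: list of rows of temperatures from the fine-mesh simulation
                  (hypothetical data layout, assumed for this sketch).
    t_res:        allowable temperature resolution, in the same unit.
    """
    if t_res <= 0:
        raise ValueError("t_res must be positive")
    temperatures = [t for row in mean_profile for t in row]
    # dT_max: spread between the hottest and coolest points of the mean profile.
    dt_max = max(temperatures) - min(temperatures)
    # Use at least one coarse cell per direction, even for a flat profile.
    return max(1, math.ceil(dt_max / t_res))


# Hypothetical 2 x 2 mean profile spanning 10 degrees C.
profile = [[80.0, 84.0],
           [90.0, 86.0]]
n_cm = coarse_mesh_size(profile, t_res=0.65)
```

With this made-up 10°C spread and a resolution of 0.65°C, the criterion gives a 16 by 16 coarse mesh. The spread is chosen for illustration only; the actual temperature range of the test chip is not stated here.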
With the above mixed-mesh strategy, the complexity of generating the statistical polynomial expression can be significantly reduced. For example, in our implementation, the deterministic thermal simulator stated in Chapter 2 is employed to calculate the deterministic temperature profile. The complexity of the baseline algorithm stated in Section 3.3.1 is $N_{PC} N_{FM} N_{FM} \, O(\log N_{Base})$, and the complexity of the mixed-mesh strategy is $(N_{FM} N_{FM} + (N_{PC} - 1) N_{CM} N_{CM}) \, O(\log N_{Base})$. Here, $N_{FM}$ is the number of grids in the x- and y-directions for the fine mesh, and $N_{Base}$ is the number of bases for expressing the deterministic temperature profile.

Table 3.3: Parameters and Truncation Points for the Channel Length and the Oxide Thickness.
| Nominal $L$ | Nominal $t_{ox}$ | $3\sigma_L$ | $3\sigma_{tox}$ | $N_L$ | $N_{tox}$ | $N_{KLg}$ |
|---|---|---|---|---|---|---|
| 65nm | 1.5nm | 12% | 5% | 13 | 13 | 49 |
The complexity ratio of the mixed-mesh strategy to a single fine-mesh deterministic thermal simulation is $1 + (N_{PC} - 1)(N_{CM}/N_{FM})^2$. In our experimental results, an accurate thermal yield profile can be estimated with the setting $N_{PC} = 53$, $N_{FM} = 128$, $N_{CM} = 16$ and $T_{res}$ = 0.65°C, for which the complexity ratio is 1.8125. Therefore, the mixed-mesh strategy brings the cost of the thermal yield profile estimator close to that of a deterministic thermal simulator.
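The arithmetic behind these figures can be checked with a short Python sketch. The helper names `num_solves` and `complexity_ratio` are hypothetical. The sketch also assumes that the number of solves equals 2 × (NL + Ntox) + 1, which is consistent with NL = Ntox = 13 in Table 3.3 and with the reported setting of 53.

```python
def num_solves(n_l, n_tox):
    """Deterministic solves needed: one mean/nominal solve plus 2 * (N_L + N_tox)."""
    return 2 * (n_l + n_tox) + 1


def complexity_ratio(n_pc, n_fm, n_cm):
    """Mixed-mesh cost relative to one fine-mesh solve: 1 + (N_PC - 1) * (N_CM / N_FM)^2.

    The common O(log N_Base) factor cancels in the ratio.
    """
    return 1 + (n_pc - 1) * (n_cm / n_fm) ** 2


n_pc = num_solves(13, 13)                 # 53
ratio = complexity_ratio(n_pc, 128, 16)   # 1.8125
# The baseline runs all N_PC solves on the fine mesh, so its ratio is N_PC.
reduction = n_pc / ratio                  # about 29.24
```

Under these complexity expressions, the mixed-mesh strategy is roughly 29 times cheaper than running all 53 simulations on the fine mesh. This is a ratio of operation counts, not a measured runtime.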
3.4 Experimental Results
The developed stochastic projection based thermal analyzer and the stochastic collocation based thermal analyzer are implemented in C++ and tested on a Linux system with an Intel Xeon 3.0-GHz CPU and 32GB of memory. The die size is 2.5mm × 2.5mm × 0.5mm. The junction depth is set to 20nm, which is the nominal value for the 65nm technology [73], and the Debye length is set to 2nm [102]. The floorplan of the test chip, which has 1.2 million functional gates, is shown in Figure 3.17(a), and the geometries of the chip and package are shown in Figure 3.17(b).
The device parameters, the truncation points of the KL expansions for the channel length ($N_L$) and the oxide thickness ($N_{tox}$), and the number of device modeling grids ($N_{KLg}$) are summarized in Table 3.3. $N_L$ and $N_{tox}$ are decided by satisfying $\gamma_{N_L+1} / \sum_{i=1}^{N_L+1} \gamma_i \le 1\%$ and $\gamma_{N_{tox}+1} / \sum_{i=1}^{N_{tox}+1} \gamma_i \le 1\%$, respectively. To model the spatial correlation, both $\eta_x/L_x$ and $\eta_y/L_y$ are set to 0.98 for the correlation function shown in equation (3.7) [81].
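The truncation criterion can be illustrated with a small Python sketch. It assumes that the eigenvalues are sorted in descending order and that the smallest N satisfying the criterion is chosen; neither detail is stated explicitly in the text. The function name `kl_truncation_point` and the sample eigenvalues are hypothetical.

```python
def kl_truncation_point(eigenvalues, tol=0.01):
    """Return the smallest N with gamma_{N+1} / sum_{i=1}^{N+1} gamma_i <= tol.

    eigenvalues: KL eigenvalues gamma_1, gamma_2, ... sorted in descending
                 order (assumed ordering for this sketch).
    """
    partial_sum = 0.0
    for n, gamma in enumerate(eigenvalues):
        # eigenvalues[n] is gamma_{n+1}; partial_sum covers gamma_1 .. gamma_{n+1}.
        partial_sum += gamma
        if n >= 1 and gamma / partial_sum <= tol:
            return n
    raise ValueError("criterion not met; more eigenvalues are needed")


# Hypothetical, geometrically decaying spectrum: 1, 0.5, 0.25, ...
sample_eigenvalues = [0.5 ** k for k in range(10)]
n_trunc = kl_truncation_point(sample_eigenvalues)
```

For this made-up spectrum, the seventh eigenvalue is the first to contribute at most 1% of the running sum, so the sketch returns N = 6. The real eigenvalues depend on the correlation function of equation (3.7) and are not reproduced here.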
By applying the modeling technique for the thermal parameters shown in Figure 3.4 of Section 3.2.3 and the modeling technique for both heat transfer paths described in [57], the thermal conductivity and the equivalent heat transfer coefficients of the primary and secondary heat flow paths are obtained for the deterministic simulator stated in Chapter 2. They are summarized in Table 3.4.
The boundary condition of each vertical surface is set to be isothermal. The top surface of the test circuit is divided into 128 × 128 grids for executing the deterministic thermal simulator.
[Figure 3.17 image. Panel (b) labels: Power Source Layer of Die, Die, Interconnect Layer, C4/CBGA Package and PCB Board; dimension labels: 20nm, 0.5mm, 0.06mm.]
Figure 3.17: Floorplan of the test die, geometries of the test chip and package, and mean and standard deviation profiles of the power density on the test chip. (a) Floorplan of the test die. (b) Geometries of the test chip and package. (c) The mean profile of power density. (d) The standard deviation profile of power density. Here, $L_x$ and $L_y$ are the width and length of the test chip, respectively.
Table 3.4: Equivalent Thermal Parameters.
| Parameter | Value |
|---|---|
| $\kappa$ | 104.6 W/(m·°C) |
| $h_p$ | 12000 W/(m²·°C) |
| $h_s$ | 2017 W/(m²·°C) |

$\kappa$: the thermal conductivity of the die.
$h_p$: the equivalent primary heat transfer coefficient.
$h_s$: the equivalent secondary heat transfer coefficient.
Table 3.5: Accuracy and Efficiency of the Developed Statistical Expression Generators.
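| Device Variation: WID/(WID+D2D) | Device Variation: D2D/(WID+D2D) | Monte Carlo‡: #Samples | Monte Carlo‡: Runtime ① (s) | Projection Based†: Max. Error (Mean) | Projection Based†: Max. Error (STDEV) | Projection Based†: Runtime ② | Collocation Based†: Max. Error (Mean) | Collocation Based†: Max. Error (STDEV) | Collocation Based†: Runtime ③ | Speedup, Projection Based (①/②) | Speedup, Collocation Based (①/③) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 40% | 60% | 6921 | 442.94 | 0.92% | 2.69% | 2.47s | 0.91% | 2.70% | 2.68s | 179.3× | 165.2× |
| 50% | 50% | 7011 | 448.70 | 0.93% | 2.43% | 2.42s | 0.91% | 2.68% | 2.72s | 185.4× | 164.9× |
| 60% | 40% | 7031 | 449.98 | 0.90% | 2.53% | 2.47s | 0.90% | 2.72% | 2.74s | 182.1× | 164.2× |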
† The maximum error is obtained by comparing with the golden solution constructed by the Monte Carlo method using $2 \times 10^5$ samples.
‡ To demonstrate the efficiency, the Monte Carlo method is run until it achieves the same accuracy of standard deviation as the developed methods. The runtime does not include the time for parsing the input, which is performed only once for all of the above methods.
In this table, “STDEV” represents the standard deviation.
The estimated mean and standard deviation profiles of the power density, with the WID and D2D variations contributing 60% and 40% of the total variation, respectively, are shown in Figure 3.17(c) and (d).