3.3 Statistical Electro-Thermal Analyzer
3.3.2 Stochastic Collocation Based Statistical Expression Generator
Smolyak Sparse Grid Formulation
The primary advantage of the Smolyak sparse grid formulation is that it constructs an interpolating polynomial of a multivariate function $u \in C^r$ from far fewer samples of the function than either the full tensor product interpolation formula or the Monte Carlo method requires, while still maintaining an acceptable error bound [96, 97]. Here, $C^r$ is the set of all functions that have continuous derivatives of all orders up to $r$. With the stochastic collocation technique, the statistical expression of the on-chip temperature distribution can be constructed efficiently.
The two methods differ in how the samples are chosen. The Monte Carlo method generates the samples of the random variables randomly and hence requires a large number of samples to achieve an accurate estimate. In contrast, the Smolyak sparse grid technique uses the roots of H-PCs or the extrema of Chebyshev polynomials [97] to generate the samples, and uses these far fewer samples to interpolate the desired solution effectively. For a two-dimensional random variable, possible sample sets of the Monte Carlo method and the Smolyak sparse grid formulation are illustrated in Figure 3.11.
[Figure 3.11, two panels with axes $\xi_1$ and $\xi_2$: Sampling Points of Monte Carlo; Sampling Points of Smolyak Sparse Grid]
Figure 3.11: Comparison of the number of sampled random variables between the Monte Carlo method and the Smolyak sparse grid formulation. Here, the samples of the Smolyak sparse grid are those used to achieve a level-two approximation.

According to the Smolyak sparse grid formulation, the on-chip temperature distribution can be explicitly approximated as follows [96, 97].
\[
\hat{T}_{q,N_{KL}}(r, \xi) = \sum_{q-N_{KL}+1 \le |i| \le q} (-1)^{q-|i|} \binom{N_{KL}-1}{q-|i|} \left( Q^{i_1}(T) \otimes \cdots \otimes Q^{i_{N_{KL}}}(T) \right). \tag{3.39}
\]
Here, $N_{KL} = N_{tox} + N_L$ is the number of random variables in $\xi$; $q = N_{KL} + l$, where $l \ge 1$ is the formulation level; and $|i| = i_1 + \cdots + i_n + \cdots + i_{N_{KL}}$. For a level $i_n \ge 1$, $Q^{i_n}$ is an interpolating polynomial of $T(r, \xi)$ that uses only the random variable $\xi_n$, and $\otimes$ is the functional cross product. The level $i_n$ is the index that determines the number of samples, $m_{i_n}$, used by the interpolating polynomial $Q^{i_n}$. As suggested in [97], $m_1 = 1$ and $m_{i_n} = 2^{i_n-1} + 1$ for $i_n > 1$.
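The index bookkeeping in (3.39) can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this work; the helper names `nodes_per_level` and `smolyak_terms` are hypothetical. It lists every multi-index that satisfies the summation constraint together with its signed binomial coefficient, and evaluates the sample-count rule suggested in [97].

```python
from itertools import product
from math import comb


def nodes_per_level(i):
    """Number of 1-D samples at level i: m_1 = 1, m_i = 2**(i-1) + 1 for i > 1."""
    return 1 if i == 1 else 2 ** (i - 1) + 1


def smolyak_terms(n_kl, level):
    """Return (multi-index, coefficient) pairs that appear in the sum of (3.39)."""
    q = n_kl + level
    terms = []
    # Every i_n >= 1 and |i| <= q, so no component can exceed level + 1.
    for idx in product(range(1, level + 2), repeat=n_kl):
        total = sum(idx)
        if q - n_kl + 1 <= total <= q:
            coeff = (-1) ** (q - total) * comb(n_kl - 1, q - total)
            terms.append((idx, coeff))
    return terms


# Hypothetical usage: two random variables, level one (q = 3).
terms = smolyak_terms(n_kl=2, level=1)
samples = {idx: tuple(nodes_per_level(i) for i in idx) for idx, _ in terms}
```

For $N_{KL} = 2$ and $l = 1$, the sum contains the three multi-indices (1, 1), (1, 2), and (2, 1), with coefficients -1, 1, and 1, respectively.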
From (3.39), only the temperature values corresponding to a small set of samples of $\xi$ need to be known [97]. This set is called the sparse grid and is given by [97]
\[
H(q, N_{KL}) = \bigcup_{q-N_{KL}+1 \le |i| \le q} \left( \hbar^{i_1} \times \cdots \times \hbar^{i_n} \times \cdots \times \hbar^{i_{N_{KL}}} \right), \tag{3.40}
\]
where $\hbar^{i_n} = \{\xi_{i_n}^{1}, \cdots, \xi_{i_n}^{m_{i_n}}\}$ is the set of sample points used by $Q^{i_n}(T)$, and the operator '$\times$' is the cross product of sets. The number of sample points in the Smolyak sparse grid formulation is $O(N_{KL}^{l}/l!)$, so the total runtime is $C_{det} \cdot O(N_{KL}^{l}/l!)$. Here, $C_{det}$ is the runtime complexity of performing the deterministic electro-thermal simulation once.
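The union in (3.40) can be assembled directly, as the following Python sketch shows. It rests on an assumption that is not spelled out at this point in the text: the 1-D sample set at level $i$ is taken to be the roots of the probabilists' Hermite polynomial of degree $m_i$, obtained here from NumPy's `hermegauss`. The function names are hypothetical, and the sketch is illustrative only.

```python
from itertools import product

import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def nodes_per_level(i):
    """m_1 = 1 and m_i = 2**(i-1) + 1 for i > 1, as suggested in [97]."""
    return 1 if i == 1 else 2 ** (i - 1) + 1


def one_d_nodes(i):
    """Assumed 1-D sample set at level i: roots of the Hermite polynomial of degree m_i."""
    roots, _weights = hermegauss(nodes_per_level(i))
    # Rounding lets a root shared by two levels compare equal in the set below.
    return np.round(roots, 12)


def sparse_grid(n_kl, level):
    """Union of the product sets in (3.40), returned as a sorted list of tuples."""
    q = n_kl + level
    points = set()
    for idx in product(range(1, level + 2), repeat=n_kl):
        if q - n_kl + 1 <= sum(idx) <= q:
            for p in product(*(one_d_nodes(i) for i in idx)):
                points.add(tuple(float(x) for x in p))
    return sorted(points)


# Hypothetical usage: two random variables, level one.
grid = sparse_grid(n_kl=2, level=1)
full_tensor_size = nodes_per_level(2) ** 2
```

With two variables at level one, the union contains five distinct points, compared with nine for the 3 x 3 full tensor grid, because the root at zero is shared by both levels.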
For a function having bounded derivatives up to order $r$, the Smolyak sparse grid formulation ensures the error bound $|E_l| \le c_{N_{KL},r} \cdot N_H^{-r} \cdot (\log N_H)^{(r+1)(N_{KL}-1)}$ [97]. Here, $N_H$ is the number of sample points in $H(q, N_{KL})$, and $c_{N_{KL},r}$ is a constant that depends only on $N_{KL}$ and $r$. In our experience, an accurate estimate of the thermal yield profile can be obtained by setting the level $l$ to 1. Therefore, the number of sample points in the Smolyak sparse grid formulation can be much smaller than that of the Monte Carlo method.
An example with $N_{KL} = 2$ and $q = N_{KL} + 1 = 3$ illustrates the Smolyak sparse grid formulation. Since $q - N_{KL} + 1 \le |i| \le q$, we have $i_1 = 1$, $i_2 = 1$ for $|i| = 2$, and either $i_1 = 1$, $i_2 = 2$ or $i_1 = 2$, $i_2 = 1$ for $|i| = 3$. Therefore, the numbers of sample values for the random variables $\xi_1$ and $\xi_2$ are $m_{i_1=1} = 1$, $m_{i_2=1} = 1$ for $|i| = 2$, and $m_{i_1=1} = 1$, $m_{i_2=2} = 3$ or $m_{i_1=2} = 3$, $m_{i_2=1} = 1$ for $|i| = 3$. From the values of $i_1$ and $i_2$, the interpolating polynomials that use each random variable individually at the different levels can be determined. The interpolating polynomials corresponding to $\xi^T = [\xi_1, \xi_2]$ at the different combined levels $(i_1, i_2)$ can then be constructed by the functional cross product. Consequently, only the temperature distributions excited by the points that belong to the following sample set of $\xi$ need to be known.
Given $\hbar^1 = \{p_{10}\}$ and $\hbar^2 = \{p_{20}, p_{21}, p_{22}\}$, we have
\[
H(3, 2) = (\hbar^1 \times \hbar^1) \cup (\hbar^1 \times \hbar^2) \cup (\hbar^2 \times \hbar^1) = \{(p_{10}, p_{10}), (p_{10}, p_{20}), (p_{10}, p_{21}), (p_{10}, p_{22}), (p_{20}, p_{10}), (p_{21}, p_{10}), (p_{22}, p_{10})\}.
\]
The sampling values of $\hbar^i$ for each level $i$ must be chosen properly. Adopting the roots of the H-PC whose order corresponds to the level $i$ achieves the most accurate result when $\xi$ is a normal random vector [98]. Choosing the extrema of the Chebyshev polynomial whose order corresponds to the level $i$ yields a nested sparse grid structure, i.e., $\hbar^i \subset \hbar^k$ for $i < k$, at every level, together with acceptable accuracy [97]. In this work, we select the roots of H-PCs as the sampling values because the result is shown to be very accurate with a low-level approximation, and the nested sparse grid structure is still preserved for $q = N_{KL} + 1$ (see footnote 6).

Footnote 6: If a high-order approximation is needed for accuracy, we suggest using the extrema of the Chebyshev polynomial because the nested sparse grid structure is preserved at every level; hence, the number of sample points can be much smaller.

Algorithm: Temperature Profile Calculation for a Sample Point
Input: A sampling point $\xi_j$, the initial temperature $T^{ini}_{\xi_j}$, and $p_d(r)$
Output: Temperature profile $T(r, \xi_j)$
1 Begin
…
†Any deterministic thermal simulator can be used to execute Line 9. Here, the simulator described in Chapter 2 is adopted.
Figure 3.12: Deterministic electro-thermal analysis for each sampling point, $\xi_j$, in the sparse grid. $p_{leak}$, $p_d$, and $p$ are the leakage, dynamic, and total power density profiles for each sampling point, respectively.

Temperature Profile Calculation for a Given Sample Point
After the sparse grid $H(q, N_{KL})$ of $\xi$ is obtained, the samples of channel length and oxide thickness in the $m$-th parameter modeling grid corresponding to the $j$-th sample point, $\xi_j$, of $H(q, N_{KL})$ can be obtained from equations (3.15) and (3.16). Hence, the deterministic power density profile corresponding to $\xi_j$ can be obtained. With the deterministic power density profile, we have the following deterministic steady-state heat transfer equation
\[
\kappa \nabla^2 T(r, \xi_j) = -p(r, \xi_j, T), \tag{3.42}
\]
subject to the following boundary condition
\[
\kappa \frac{\partial T(r_{bs}, \xi_j)}{\partial \vec{n}_{bs}} + h_{bs} T(r_{bs}, \xi_j) = f_{bs}(r_{bs}). \tag{3.43}
\]
Here, $p(r, \xi_j, T)$ and $T(r, \xi_j)$ are the deterministic power density and temperature profiles with respect to $\xi_j$, respectively. Since the power density profile in equation (3.42) is temperature dependent, a deterministic electro-thermal analysis procedure, summarized in Figure 3.12, is built to obtain each $T(r, \xi_j)$.
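The coupling between (3.42) and the temperature-dependent power can be illustrated with a fixed-point iteration on a toy problem. The Python sketch below is not the procedure of Figure 3.12 or the simulator of Chapter 2. It assumes a 1-D rod discretized by finite differences, a hypothetical exponential leakage model, and made-up parameter values, and it simply alternates between updating the power density and re-solving the heat equation until the temperature stops changing.

```python
import numpy as np


def solve_heat(p, kappa, h_bs, f_bs, dx):
    """Finite-difference solve of kappa*T'' = -p on a 1-D rod with Robin boundaries."""
    n = p.size
    A = np.zeros((n, n))
    b = np.empty(n)
    for i in range(1, n - 1):
        A[i, i - 1] = A[i, i + 1] = kappa / dx**2
        A[i, i] = -2.0 * kappa / dx**2
        b[i] = -p[i]
    # kappa*dT/dn + h_bs*T = f_bs along the outward normal (one-sided difference).
    A[0, 0] = A[-1, -1] = kappa / dx + h_bs
    A[0, 1] = A[-1, -2] = -kappa / dx
    b[0] = b[-1] = f_bs
    return np.linalg.solve(A, b)


def leakage(T, p0=5.0, beta=0.02, T_ref=300.0):
    """Hypothetical leakage model: exponential growth with temperature."""
    return p0 * np.exp(beta * (T - T_ref))


def electro_thermal(p_dyn, leak, T_init, kappa, h_bs, f_bs, dx, tol=1e-6, max_iter=100):
    """Alternate power update and thermal solve until the profile converges."""
    T = np.full_like(p_dyn, T_init)
    for _ in range(max_iter):
        p_total = p_dyn + leak(T)  # total power = dynamic + leakage
        T_new = solve_heat(p_total, kappa, h_bs, f_bs, dx)
        if np.max(np.abs(T_new - T)) < tol:
            return T_new
        T = T_new
    raise RuntimeError("electro-thermal iteration did not converge")


# Hypothetical usage with made-up, dimensionless parameters.
n = 51
dx = 1.0 / (n - 1)
p_dyn = np.full(n, 20.0)
T = electro_thermal(p_dyn, leakage, T_init=300.0, kappa=1.0,
                    h_bs=10.0, f_bs=10.0 * 300.0, dx=dx)
```

A plain fixed-point loop is adequate here because the assumed leakage sensitivity is small; a stronger temperature dependence might call for damping or a Newton-type update.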
Temperature Profile Construction by Using Polynomial Interpolation
Instead of directly using equation (3.39) to obtain $Q^{i_1}(T) \otimes \cdots \otimes Q^{i_{N_{KL}}}(T)$ for each different $|i| = i_1 + \cdots + i_{N_{KL}}$, we take advantage of the nested sparse grid structure and apply the Newton interpolating method [98] to interpolate $T(r, \xi)$ globally (see footnote 7). Based on the Newton interpolating formula, the approximated on-chip temperature at a specified position of the die, $T(r^*, \xi)$, can be expressed as
\[
\hat{T}(r^*, \xi) = \sum_{j=0}^{N_H-1} \hat{u}_j(r^*) \phi_j(\xi). \tag{3.44}
\]
Here, each $\phi_j(\xi)$ is an interpolating polynomial with respect to the $j$-th sampling vector $\xi_j$, and the form of each $\phi_j(\xi)$ can be found in [98]. $N_H = |H(q, N_{KL})|$ is the number of sampling vectors in the sparse grid. Each $\hat{u}_j(r^*)$ is an unknown coefficient that needs to be determined.
Because an interpolant must match every known data point, the interpolating polynomial in (3.44) satisfies the following equation for each $\xi_n$.
\[
\sum_{j=0}^{N_H-1} \hat{u}_j(r^*) \phi_j(\xi_n) = T(r^*, \xi_n). \tag{3.45}
\]
Footnote 7: For a sparse grid that does not preserve the nested structure, the Newton interpolating method can also be applied to obtain each $Q^{i_1}(T) \otimes \cdots \otimes Q^{i_{N_{KL}}}(T)$.
Algorithm: Stochastic Collocation Based Electro-thermal Analysis
Input: Geometries of the die; spatial correlation models of device channel length and oxide thickness; design information such as .def, .lef, and .lib files; package structure and leakage power models
Output: Mean profile, variance profile, and the Smolyak sparse grid interpolation formula, $\hat{T}(r, \xi)$, of the on-chip temperature distribution
1 Begin
2   Set thermal parameters and the initial average mean temperature, $\mu_T^{ini}$, of the die by the 1-D thermal model;
3   For m ← 1 to $N_g$
4     Obtain $g_{L_m}$ and $g_{t_{ox_m}}$ of $L_m$ and $t_{ox_m}$ by the KL expansion, respectively;
5   EndFor
6   Generate the Smolyak sparse grid, $H(q, N_{KL})$, for the KL expanded random variables.
7   For n ← 0 to $|H(q, N_{KL})| - 1$
8     Obtain $T(r, \xi_n)$ by using the algorithm shown in Figure 3.12.
9   EndFor
10  Solve equation (3.46) to obtain the Newton interpolation formula in equation (3.44), and calculate the mean and variance profiles.
11 End
Figure 3.13: Stochastic Collocation Based Statistical Expression Generating Algorithm.
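With the property of $\phi_j(\xi)$ described in [98], equation (3.45) can be rewritten in the following matrix form for finding each $\hat{u}_j(r^*)$ at the chip position $r^*$.
\[
\begin{bmatrix}
\phi_0(\xi_0) & 0 & \cdots & 0 \\
\phi_0(\xi_1) & \phi_1(\xi_1) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
\phi_0(\xi_{N_H-1}) & \phi_1(\xi_{N_H-1}) & \cdots & \phi_{N_H-1}(\xi_{N_H-1})
\end{bmatrix}
\begin{bmatrix}
\hat{u}_0(r^*) \\ \hat{u}_1(r^*) \\ \vdots \\ \hat{u}_{N_H-1}(r^*)
\end{bmatrix}
=
\begin{bmatrix}
T(r^*, \xi_0) \\ T(r^*, \xi_1) \\ \vdots \\ T(r^*, \xi_{N_H-1})
\end{bmatrix}. \tag{3.46}
\]
Each $\hat{u}_j(r^*)$ can be calculated by forward substitution. After each $\hat{u}_j(r^*)$ is calculated, the mean and variance profiles of the temperature distribution can be estimated by taking the expectation of (3.44) and of its square with respect to $\xi$.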
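The steps above (matching the data, forward substitution, then the mean and variance) can be illustrated for a single random variable. The Python sketch below is a hedged, one-dimensional stand-in: the multivariate basis functions $\phi_j(\xi)$ are defined in [98] and are not reproduced here, so the classical 1-D Newton basis is assumed instead. The temperature samples are made up, and the moments are evaluated with Gauss-Hermite quadrature for a standard normal $\xi$.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def newton_basis(j, xi, nodes):
    """Assumed 1-D Newton basis: phi_0 = 1, phi_j(xi) = prod_{k<j} (xi - nodes[k])."""
    out = np.ones_like(np.asarray(xi, dtype=float))
    for k in range(j):
        out = out * (xi - nodes[k])
    return out


def forward_substitution(L, b):
    """Solve L u = b for a lower-triangular matrix L, one row at a time."""
    u = np.zeros_like(b, dtype=float)
    for n in range(len(b)):
        u[n] = (b[n] - L[n, :n] @ u[:n]) / L[n, n]
    return u


# Hypothetical samples: three nodes and made-up temperatures at one chip position.
nodes = np.array([0.0, -np.sqrt(3.0), np.sqrt(3.0)])
temps = 350.0 + 5.0 * nodes + 0.5 * nodes**2

# Collocation matrix Phi[n, j] = phi_j(xi_n); it is lower triangular by construction.
Phi = np.column_stack([newton_basis(j, nodes, nodes) for j in range(len(nodes))])
u_hat = forward_substitution(Phi, temps)


def t_hat(xi):
    """Interpolant in the form of (3.44)."""
    return sum(u_hat[j] * newton_basis(j, xi, nodes) for j in range(len(nodes)))


# Moments under a standard normal xi via Gauss-Hermite quadrature.
x, w = hermegauss(10)
w = w / np.sqrt(2.0 * np.pi)  # normalize the weights so that they sum to one
mean = np.sum(w * t_hat(x))
variance = np.sum(w * t_hat(x) ** 2) - mean**2
```

Because the basis function $\phi_j$ vanishes at every node that precedes $\xi_j$, the collocation matrix is lower triangular, which is what allows forward substitution to be used instead of a general linear solve.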
The algorithm of the developed stochastic collocation based statistical expression generator is shown in Figure 3.13.
[Figure 3.14 diagram labels: Gates Placement; Power density in a grid is obtained by summing the product of the accumulated area of the same type of gates and the power value of that type with updated …]
Figure 3.14: Implementation of solving the deterministic heat transfer equations.