Stochastic Thermal Simulator Considering Within-die Spatial Correlation under Process Variations

National Chiao Tung University

Department of Communication Engineering

Master's Thesis

Student: Jia-Hong Wu

Advisor: Prof. Yu-Min Lee

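Abstract (Chinese)

Traditionally, on-chip thermal analysis has mainly solved the heat transfer equation with deterministic power consumption. However, as process technology advances, variations in parameters such as transistor channel length and oxide thickness have a major impact on circuit performance, power consumption, and reliability. Ignoring process variations at the chip design stage causes serious yield problems. In this thesis, we propose a method for statistical on-chip temperature analysis that accounts for process variations with within-die spatial correlation. This thesis is the first stochastic chip thermal simulator that considers spatially correlated within-die process variations. It uses the Karhunen-Loeve transformation to handle the spatially correlated random processes, and uses polynomial chaos and the stochastic Galerkin method to solve the stochastic heat transfer equation. A comparison with Monte Carlo simulation demonstrates the accuracy and efficiency of the proposed method. The simulation results provide a reliable thermal yield estimate and guide designers in avoiding thermal failure in the deep-submicron era. Finally, we point out that spatially correlated process variations cannot be neglected when an accurate on-chip temperature distribution is required.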


Stochastic Thermal Simulator Considering Within-die

Spatial Correlation under Process Variations

Student : Jia-Hong Wu Advisor : Dr. Yu-Min Lee

Department of Communication Engineering

National Chiao Tung University

ABSTRACT

Traditionally, thermal analysis of a chip has been conducted by solving the heat transfer equation with deterministic heat sources. However, with technology scaling, fluctuations in physical parameters such as channel length and oxide thickness have a substantial impact on circuit performance, power consumption, and reliability. Ignoring manufacturing process variations at the design stage can cause severe yield loss. In this thesis, we present a method to analyze the statistical temperature distribution of a full chip under process variations with a known within-die spatial correlation function. To the best of the author's knowledge, this is the first full-chip stochastic thermal simulator that considers within-die process variations. This work uses the Karhunen-Loeve transformation to handle physical parameters with spatial correlation, and uses polynomial chaos and the stochastic Galerkin method to solve the stochastic heat transfer equation. We demonstrate the accuracy and efficiency of the proposed methodology in comparison with Monte Carlo simulation. The simulation results support a robust thermal yield estimate and can guide designers in avoiding thermal failure in nanometer technology. Furthermore, we point out that within-die spatial correlation cannot be neglected for accurate temperature estimation.

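Acknowledgments

For the successful completion of this thesis, I first sincerely thank my advisor, Dr. Yu-Min Lee. Whenever I was confused or lost, he always pointed me toward a clear path, which made it possible to complete this thesis. During these two years of the master's program, the training and guidance I received from him and from the laboratory will have a considerable influence on my future career or further study. In these two years in the laboratory I have grown a great deal and learned many methods for doing research and handling problems. Finally, I thank my advisor and the laboratory for everything they have given me over these two years.

Regarding the experiments, I am very grateful to Prof. 李義明 for providing us with the TSMC 65 nm process parameters. Without his help, the experiments would have been very difficult to carry out. I express my deep gratitude to him here.

In the laboratory, I thank my seniors 培育 and 柏毅 for passing on their valuable knowledge and experience, and 國富 and my juniors 焯基, 斯安, 懷中, and 宗祐 for their everyday care and help, for relieving the pressure of daily life, and for accompanying me and growing with me along the way.

Finally, I deeply thank my parents. Your hard work and selfless support allowed me to complete my master's studies. I also thank Doci for the companionship and care. I dedicate this thesis to you, to share this joy and honor.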

List of Figures

1.1 Leakage variations [13]

2.1 Physical design flow

2.2 The eigenvalues decay rate

2.3 The exponential covariance kernel

2.4 (a) 25-term approximation of covariance surface; σ = 4.3. (b) 25-term relative error surface of covariance approximation.

2.5 (a) 75-term approximation of covariance surface; σ = 4.3. (b) 75-term relative error surface of covariance approximation.

2.6 Procedure of Multinormal-Cholesky

2.7 Procedure of Cholesky-Decomposition

3.1 Compact thermal model of the early design stage for stochastic heat sources.

3.2 Stochastic thermal simulation flowchart

3.3 Procedure of gate power projection

3.4 Procedure of sub power projection

4.1 The floorplan of our test circuit

4.2 (a) The nominal power distribution at the top surface of die, (b) The mean power distribution at the top surface of die.

4.3 (a) The 3D nominal temperature distribution at the top surface of die, (b) The 3D mean temperature distribution at the top surface of die.

4.4 (a) The 2D nominal temperature distribution at the top surface of die, (b) The 2D mean temperature distribution at the top surface of die.

4.5 (a) The 3D standard deviation temperature distribution at the top surface of die with considering spatial correlation, (b) The 3D standard deviation temperature distribution at the top surface of die without considering spatial correlation.

4.6 (a) The 2D standard deviation temperature distribution at the top surface of die with considering spatial correlation, (b) The 2D standard deviation temperature distribution at the top surface of die without considering spatial correlation.

4.7 The mean value plus three standard deviation of temperature distribution.

List of Tables


Chapter 1 Introduction

1.1 Introduction

Because of the drastic increase in the power consumption of integrated circuits, thermal issues have become important concerns in VLSI manufacturing. High temperatures and large thermal gradients have substantial impacts on timing, performance, power, and reliability. The elevated temperature is mainly caused by high power consumption. Furthermore, leakage power is expected to increase drastically and become the dominant part of total power consumption. Subthreshold leakage and gate tunneling leakage are the major leakage components in advanced CMOS technology. A lower transistor threshold voltage results in an exponential increase in subthreshold leakage current. A thinner gate oxide, used to control the short-channel effect and to enhance transistor driving strength, causes considerable gate tunneling leakage current.

An important concern of VLSI design and manufacturing in nanometer technology is process variation. With technology scaling, the decreased controllability of manufacturing processes has resulted in substantial variations in circuit performance. Generally, process variations can be classified into die-to-die variation and within-die variation. The die-to-die variation is mainly caused by thermal gradients, equipment properties, wafer polishing, and wafer placement. It varies slowly and smoothly across a large chip, so it can be averaged over the die and incorporated into the mean value [1]. The within-die variation is generally caused by pattern planarization in chemical mechanical polishing and by lithography effects [2]. It affects devices of the same type at different locations within a chip differently and exhibits spatial correlation: devices close to each other behave more similarly than devices located far apart. It has been shown that within-die variations are the most important of all the system variations that influence circuit performance [3]. Note that a 10% variation in oxide thickness causes a 15X difference in gate tunneling current for the 100nm BPTM process technology [13] (Fig. 1.1). Moreover, the variation in chip-level leakage power is expected to be as high as 20X [24], and the related fluctuation in the temperature distribution is considerable. It is worth noting that the variation of leakage power becomes more severe beyond the 65nm technology. Worst-case deterministic simulation can result in excessive guard-banding and therefore low performance [23]. Furthermore, underestimating the temperature and power consumption of circuits can lead to unnecessarily low yield. These undesirable phenomena make statistical thermal simulation essential, especially for leakage-power-dominated technologies.


1.2 Motivation

Existing thermal simulation techniques can be classified into two categories: numerical methods and analytical methods. Numerical thermal simulators convert the heat transfer equation into an equivalent RC network, and several efficient approaches have been developed [16, 17, 18]. Analytical methods avoid directly meshing the volume of the entire substrate and represent the temperature distribution in closed form, such as the Green's function based method [19] and the Generalized Integral Transforms (GIT) based method [14]. However, all these works treat the power dissipation of the chip as deterministic, which results in an optimistic estimate of the temperature distribution. Such optimistic simulation results can seriously decrease the manufacturing yield. Therefore, a reliable and robust thermal simulator must account for process variations in physical parameters and leakage power consumption.

In this work, we propose a stochastic thermal simulation procedure that considers within-die process variations for leakage-power-dominated designs. With the help of the Karhunen-Loeve expansion [9], we transform the random processes of physical parameters such as channel length and oxide thickness, which have known spatial correlation, into a set of uncorrelated random variables. After transforming the parameters, we employ the polynomial chaos scheme and the stochastic Galerkin procedure to convert the stochastic heat transfer equations into a set of deterministic problems. The temperature distribution is then obtained by applying an efficient deterministic thermal solver [14] to this set of deterministic heat transfer equations. We evaluate the means and variances of the full-chip temperature distribution and demonstrate the large difference between simulation results that consider spatial correlation and those that ignore it.

1.3 Our Contributions

In this work, we propose a stochastic thermal simulator that considers within-die process variations for leakage-power-dominated designs. By using the Karhunen-Loeve expansion, we transform the random processes of physical parameters (channel length and oxide thickness) with spatial correlation into a set of uncorrelated random variables. After transforming the parameters, we employ the polynomial chaos scheme and the stochastic Galerkin procedure to convert the stochastic thermal problem into a set of deterministic problems. We then apply an efficient deterministic thermal solver [14] to obtain the final solutions. We evaluate the mean value and variance of the full-chip temperature distribution and demonstrate the large difference between simulation results that consider spatial correlation and those that ignore it. To the best of the author's knowledge, this is the first full-chip stochastic thermal simulator that considers within-die process variations in nanometer technology.

1.4 Organization of the Thesis

The rest of this thesis is organized as follows. The parameter modeling used in this work, polynomial chaos, the Karhunen-Loeve expansion, and the Monte Carlo technique are introduced in Chapter 2. The problem formulation, the simulation flowchart, and the stochastic Galerkin procedure used in this work are addressed in Chapter 3. Finally, the experimental results and the conclusion are given in Chapters 4 and 5, respectively.


Chapter 2 Preliminaries

In this chapter, we first introduce the physical design flow in Section 2.1. Then, the parameter modeling is presented in Section 2.2. After that, the polynomial chaos for the random space is shown in Section 2.3. Finally, we introduce the statistical leakage power modeling in Section 2.4 and the Monte Carlo technique in Section 2.5.

2.1 Physical Design Flow

The physical design flow of a circuit, shown in Fig. 2.1, is the phase that precedes fabrication. In the most general terms, physical design refers to all synthesis steps that follow logic design and precede fabrication. These include logic partitioning, floorplanning, placement, routing, compaction, extraction, and verification. Floorplanning is an essential design step when a hierarchical/building-block design methodology is used. It helps designers define the layout hierarchy, estimate the overall required area, and determine the aspect ratio of each module, and it is closely related to placement. Thermal-driven floorplanning offers a great deal of flexibility in mitigating thermal problems, but it also faces large uncertainty about the accurate thermal profile. The placement stage of the physical design flow arranges the circuit components on a layout surface. The general purpose of a thermal-aware placement methodology is to minimize the maximal temperature gradient over the chip and obtain a uniform temperature distribution. Moreover, thermal-driven methodologies at the routing stage consider a more detailed thermal model that includes interconnects. However, [21] pointed out that thermal-driven methodologies applied at the early stage of the physical design flow gain more benefit. Thus, in this work, we propose a thermal simulator that is suitable for the early stage of physical design.

Fig. 2.1: Physical design flow (Circuit Synthesis, Partitioning, Floorplan & Placement, Routing, Compaction, Extraction & Verification, Fabrication; bracket label: Physical Design).

2.2 Parameter Modeling

Spatial correlation of process parameters has received increased attention recently. Considering within-die spatial correlation increases the number of random variables (RVs) to be handled and hence the computation cost. A conventional technique partitions the layout plane into grid cells, assumes perfect correlation for all random variables (channel length, oxide thickness) in the same grid cell, and computes a correlation matrix for the grid cells [25]. The number of correlated random variables can be further reduced by applying principal component analysis (PCA) [26] [27]. In PCA, a linear variable transformation is used to find the directions of largest data variance, the principal components. However, the nature of PCA often limits its capability in modeling high-dimensional parameter variations for performance modeling. An alternative formulation for handling correlated parameters is the Karhunen-Loeve expansion, in which a random process is expanded in terms of a denumerable set of orthogonal random variables with deterministic functions that are related to the corresponding covariance kernel. The detailed parameter modeling and the Karhunen-Loeve expansion are presented in the following subsections.

2.2.1 Karhunen-Loeve Expansion

In the presence of process variations, physical parameters such as channel length and oxide thickness can be modeled as random processes with given spatial covariance functions. Since the values of the physical parameters are bounded above and below, we can assume that the random process $\alpha(\mathbf{x}, \vartheta)$ of each parameter under consideration is a second-order stochastic process, where $\vartheta \in \Omega$ and $\mathbf{x} = (x, y) \in D_p$. Here, $\Omega$ is the set of manufacturing outcomes for a specific physical parameter, and $D_p$ is the domain of the chip in the x- and y-directions.

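DEFINITION 2.1 [6] A second-order random variable $\alpha(\vartheta)$ is one satisfying $E[|\alpha(\vartheta)|^2] < \infty$. A second-order stochastic process $\alpha(\mathbf{x}, \vartheta)$ is a family of second-order random variables.

DEFINITION 2.2 [6] A second-order process $\alpha(\mathbf{x}, \vartheta)$ is continuous in quadratic mean (q.m.) if $E[|\alpha(\mathbf{x} + \mathbf{h}, \vartheta) - \alpha(\mathbf{x}, \vartheta)|^2] \to 0$ as $\|\mathbf{h}\| \to 0$ for all $\mathbf{x} \in D_p$,

where $\|\cdot\|$ is the Euclidean norm. Since the explicit form of $\alpha(\mathbf{x}, \vartheta)$ is not available, we use the following theorem to guarantee its q.m. continuity.

THEOREM A second-order process $\alpha(\mathbf{x}, \vartheta)$ is q.m. continuous at $\mathbf{x}$ if, and only if, its covariance function $C(\mathbf{x}_1, \mathbf{x}_2)$ is continuous at $(\mathbf{x}, \mathbf{x})$.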

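DEFINITION 2.3 [6] A second-order q.m. continuous process $\alpha(\mathbf{x}, \vartheta)$ on a closed domain $D_p$ has an orthogonal decomposition

$$\alpha(\mathbf{x}, \vartheta) = \tilde{\alpha}(\mathbf{x}) + \sum_{i=1}^{\infty} \sqrt{\lambda_i}\, f_i(\mathbf{x})\, \zeta_i(\vartheta) \tag{2.1}$$

$$E[\zeta_m \zeta_n] = \delta_{mn}; \qquad \int_{D_p} f_m(\mathbf{x}) f_n(\mathbf{x})\, d\mathbf{x} = \delta_{mn} \tag{2.2}$$

where $\tilde{\alpha}(\mathbf{x})$ is the mathematical expectation of the process $\alpha(\mathbf{x}, \vartheta)$, if, and only if, the $\lambda_i$ are the eigenvalues and the $f_i(\mathbf{x})$ are the orthonormalized eigenfunctions of $C(\mathbf{x}_1, \mathbf{x}_2)$. The expansion then converges in q.m. uniformly on $D_p$.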

The Fourier-type series expansion in equation (2.1) is the Karhunen-Loeve expansion (KLE), which is an optimal way of representing a random process based on the spectral decomposition of the given covariance kernel. The expansion converts a random process into a model with a minimum number of degrees of freedom and minimizes the mean-square error of the finite-term representation [9]. In practice, the summation in equation (2.1) is truncated at a finite number of terms $N_{kl}$, which is determined by the decay trend of the eigenvalues so as to ensure an acceptable error. Note that the decay trend is crucial, since the truncation number determines the computational efficiency and complexity of the work. Generally speaking, the smoother the covariance function, the faster the eigenvalues of the KLE decay. The eigenvalues and eigenfunctions can be derived from the following Fredholm integral equation:

$$\int_{D_p} C(\mathbf{x}_1, \mathbf{x}_2)\, f_n(\mathbf{x}_2)\, d\mathbf{x}_2 = \lambda_n f_n(\mathbf{x}_1) \tag{2.3}$$

From its definition, the covariance function is bounded, symmetric, and positive definite. This fact guarantees a number of properties for the eigenfunctions and eigenvalues that are the solutions of equation (2.3).

1. The eigenfunction set fi(x) is orthogonal and complete.

2. The eigenvalues are all positive real numbers.

4. The covariance kernel admits the following uniformly convergent expansion:

$$C(\mathbf{x}_1, \mathbf{x}_2) = \sum_{k=1}^{\infty} \lambda_k f_k(\mathbf{x}_1) f_k(\mathbf{x}_2) \tag{2.4}$$

A physical parameter with spatial correlation, such as the channel length, can be expanded into a Fourier-type series by the Karhunen-Loeve expansion:

$$L(\mathbf{x}, \theta) \simeq \tilde{L}(\mathbf{x}) + \sum_{i=1}^{N_{kl}} \sqrt{\lambda_i}\, f_i(\mathbf{x})\, \zeta_i(\theta) \tag{2.5}$$

where $\tilde{L}(\mathbf{x})$ is the mean value function of the channel length, $\lambda_i$ and $f_i(\mathbf{x})$ are the eigenvalues and eigenfunctions corresponding to the given covariance function, respectively, and $\theta \in \Omega_L$. Here, $\Omega_L$ is the set of manufacturing outcomes for the channel length. The set $\{\zeta_i\}$ consists of orthonormal random variables with zero mean and unit variance. The system random process can be assumed to be a Gaussian process [30], in which case $\{\zeta_i\}$ is a set of standard Gaussian random variables. The Karhunen-Loeve expansion thus transforms the random process with spatial correlation into a set of uncorrelated orthonormal random variables and greatly reduces the dimension of the random variables. In the same way, the oxide thickness random process can be transformed into a set of standard Gaussian random variables $\{\varsigma_i\}$. We assume that the oxide thickness random process is independent of the channel length, so the two sets of random variables ($\{\zeta_i\}$, $\{\varsigma_i\}$) are independent. For notation, we arrange $\{\zeta_i\}$ and $\{\varsigma_j\}$ as $\{\xi_i\} = \{\zeta_i, \varsigma_j\}$, and $\{\xi_i\}$ is used as the set of system random variables that span the bases of the random space. In the following subsection, we introduce the covariance function adopted for the physical parameter random processes and derive the eigenvalues and eigenfunctions of the Karhunen-Loeve expansion for this covariance function.

2.2.2 Spatial Correlation Modeling

Recently, the spatial correlation of within-die variations has been taken seriously into account in the VLSI verification flow, and a precise spatial correlation function is necessary for capturing the manufacturing information. Extracting the features of process variations to model and construct the covariance kernel is mainly based on measured data, and several robust techniques have been proposed to build a valid spatial covariance function [4] [5] that has the positive semidefinite property [8]. In this work, we adopt the spatial covariance function introduced in [4] [6] (see footnote 1):

$$C(\mathbf{x}_1, \mathbf{x}_2) = \sigma^2\, e^{-|x_1 - x_2|/\eta_x}\, e^{-|y_1 - y_2|/\eta_y} \tag{2.6}$$

where $\mathbf{x}_1 = (x_1, y_1)$, $\mathbf{x}_2 = (x_2, y_2)$, and $\eta_x$ and $\eta_y$ are the correlation lengths in the x- and y-directions, respectively. The term $\sigma$ is the standard deviation of the random process. The covariance kernel is defined on the rectangular domain $D_p$. In general, the correlation between the parameters of two identical devices drops as their distance increases. The literature [7] further pointed out that, for two logic gates, the correlation approaches zero as the distance nears half of the chip dimension. This observation suggests that the ratio of the correlation length to the chip dimension is constant across design benchmarks of different chip sizes.

Footnote 1: Although we choose this specific spatial covariance function in this work, our simulation flow can be applied to any valid spatial covariance function.

For the multi-dimensional problem, if the given covariance function is separable, the solutions of equation (2.3) can be derived independently for the x and y directions. For the covariance kernel in this work, the eigenvalues and eigenfunctions of the one-dimensional problem can be expressed as follows [9]:

$$\lambda_{n,x} = \frac{2\eta_x\sigma^2}{\eta_x^2\omega_{n,x}^2 + 1} \tag{2.7}$$

$$f_n(x) = a_{n,1}\cos(\omega_{n,x}x) + a_{n,2}\sin(\omega_{n,x}x) \tag{2.8}$$

$$a_{n,2} = \frac{1}{\sqrt{(\eta_x^2\omega_{n,x}^2 + 1)L_x/2 + \eta_x}} \tag{2.9}$$

$$a_{n,1} = \eta_x\,\omega_{n,x}\,a_{n,2} \tag{2.10}$$

where $\omega_{n,x}$ are the positive roots of the following characteristic equation:

$$(\eta_x^2\omega_x^2 - 1)\sin(\omega_x L_x) = 2\eta_x\omega_x\cos(\omega_x L_x) \tag{2.11}$$

The characteristic equation is obtained by combining the boundary conditions with equation (2.3).

The multi-dimensional eigenvalues and eigenfunctions are obtained by combining the x and y directions:

$$\lambda_n = \frac{4\eta_x\eta_y\sigma^2}{[\eta_x^2\omega_{i,x}^2 + 1][\eta_y^2\omega_{j,y}^2 + 1]} \tag{2.12}$$

$$f_n(x, y) = f_i(x)\,f_j(y) \tag{2.13}$$

where $\omega_{i,x}$ and $\omega_{j,y}$ are the solutions of equation (2.11) for the parameters $(L_x, \eta_x)$ and $(L_y, \eta_y)$, respectively. The index $n$ is assigned to the index pairs $(i, j)$ such that the eigenvalues $\lambda_n$ form a decreasing sequence.
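The mapping from the index pairs (i, j) to the single index n can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function name `combine_eigenvalues` and the numeric 1-D eigenvalues are hypothetical, and the division by σ² reflects that each 1-D eigenvalue in (2.7) already carries one factor σ² while (2.12) carries only one in total.

```python
import numpy as np

def combine_eigenvalues(lam_x, lam_y, sigma2, n_terms):
    """Return the n_terms largest 2-D eigenvalues and their (i, j) index pairs.

    lam_x, lam_y: 1-D eigenvalues as in eq. (2.7), each including sigma^2.
    The pairs use zero-based indices into lam_x and lam_y.
    """
    lam_x = np.asarray(lam_x, dtype=float)
    lam_y = np.asarray(lam_y, dtype=float)
    # Product of the 1-D eigenvalues, divided once by sigma^2 (eq. 2.12).
    table = np.outer(lam_x, lam_y) / sigma2
    # Flattened indices of the largest entries, in decreasing order.
    order = np.argsort(table, axis=None)[::-1][:n_terms]
    i_idx, j_idx = np.unravel_index(order, table.shape)
    pairs = list(zip(i_idx.tolist(), j_idx.tolist()))
    return table[i_idx, j_idx], pairs

# Hypothetical 1-D eigenvalues, for illustration only.
lam_x = [0.50, 0.20, 0.05]
lam_y = [0.40, 0.15, 0.02]
lam_2d, pairs = combine_eigenvalues(lam_x, lam_y, sigma2=1.0, n_terms=4)
```

Sorting the full product table is adequate for a few hundred 1-D modes; the ordering it produces is the decreasing sequence of λ_n that the truncation in (2.5) relies on.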

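The detailed procedure for solving the eigenvalues and eigenfunctions of the covariance kernel used in this work is as follows. We start from the Fredholm equation

$$\int_D C(x_1, x_2, y_1, y_2)\, f(x_1, y_1)\, dx_1 dy_1 = \lambda f(x_2, y_2) \tag{2.14}$$

With the exponential kernel of equation (2.6), this becomes

$$\int_D \sigma^2 e^{-|x_1 - x_2|/\eta_x} e^{-|y_1 - y_2|/\eta_y} f(x_1, y_1)\, dx_1 dy_1 = \lambda f(x_2, y_2) \tag{2.15}$$

Because the kernel is separable, the eigenvalue problem can be solved independently for the x and y directions to obtain the eigenvalues $\lambda_x$ and $\lambda_y$ and the eigenfunctions $f_x(x)$ and $f_y(y)$:

$$\int_D \sigma^2 e^{-|x_1 - x_2|/\eta_x} e^{-|y_1 - y_2|/\eta_y} f_x(x_1) f_y(y_1)\, dx_1 dy_1 = \frac{\lambda_x \lambda_y}{\sigma^2} f_x(x_2) f_y(y_2) \tag{2.16}$$

The factor $1/\sigma^2$ appears because each one-dimensional eigenvalue defined below includes $\sigma^2$, which is consistent with equation (2.12). For the separable multi-dimensional problem, we consider the following one-dimensional eigenvalue problem and then extend it to the two-dimensional solution:

$$\int_0^{L_x} \sigma^2 e^{-|x_1 - x_2|/\eta_x} f_x(x_1)\, dx_1 = \lambda_x f_x(x_2) \tag{2.17}$$

Taking the derivative of equation (2.17) with respect to $x_2$ yields

$$-\frac{1}{\eta_x}\int_0^{x_2} e^{(x_1 - x_2)/\eta_x} f_x(x_1)\, dx_1 + \frac{1}{\eta_x}\int_{x_2}^{L_x} e^{(x_2 - x_1)/\eta_x} f_x(x_1)\, dx_1 = \frac{\lambda_x}{\sigma^2} f_x'(x_2) \tag{2.18}$$

Taking the derivative of equation (2.18) with respect to $x_2$ again gives

$$-\frac{1}{\eta_x}\left[f_x(x_2) - \frac{1}{\eta_x}\int_0^{x_2} e^{(x_1 - x_2)/\eta_x} f_x(x_1)\, dx_1\right] - \frac{1}{\eta_x}\left[f_x(x_2) + \frac{1}{\eta_x}\int_{L_x}^{x_2} e^{(x_2 - x_1)/\eta_x} f_x(x_1)\, dx_1\right] = \frac{\lambda_x}{\sigma^2} f_x''(x_2) \tag{2.19}$$

$$-\frac{2}{\eta_x} f_x(x_2) + \frac{1}{\eta_x^2}\left[\int_0^{x_2} e^{(x_1 - x_2)/\eta_x} f_x(x_1)\, dx_1 + \int_{x_2}^{L_x} e^{(x_2 - x_1)/\eta_x} f_x(x_1)\, dx_1\right] = \frac{\lambda_x}{\sigma^2} f_x''(x_2) \tag{2.20}$$

By equation (2.17), the bracketed sum of integrals equals $(\lambda_x/\sigma^2) f_x(x_2)$, so

$$\frac{\lambda_x}{\sigma^2} f_x''(x_2) = -\frac{2}{\eta_x} f_x(x_2) + \frac{1}{\eta_x^2}\left[\frac{\lambda_x}{\sigma^2} f_x(x_2)\right] \tag{2.21}$$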

Rearranging equation (2.21) gives

$$f_x''(x_2) + \frac{2\eta_x\sigma^2 - \lambda_x}{\lambda_x\eta_x^2}\, f_x(x_2) = 0 \tag{2.22}$$

The boundary conditions associated with equation (2.22) are determined from equations (2.17) and (2.18) by setting $x_2 = 0$ and $x_2 = L_x$:

$$\eta_x f_x'(0) = f_x(0), \qquad \eta_x f_x'(L_x) = -f_x(L_x)$$

The general solution of equation (2.22) is

$$f_x(x) = c_1\cos(\omega_x x) + c_2\sin(\omega_x x) \tag{2.23}$$

$$\omega_x^2 = \frac{2\eta_x\sigma^2 - \lambda_x}{\lambda_x\eta_x^2} \tag{2.24}$$

Applying the boundary conditions gives the following equations for the coefficients $c_1$ and $c_2$:

$$c_1 - \eta_x\omega_x c_2 = 0 \tag{2.25}$$

$$[-\eta_x\omega_x\sin(\omega_x L_x) + \cos(\omega_x L_x)]\,c_1 + [\eta_x\omega_x\cos(\omega_x L_x) + \sin(\omega_x L_x)]\,c_2 = 0 \tag{2.26}$$

Requiring a nontrivial solution of equations (2.25) and (2.26) yields an equation for $\omega_x$:

$$(\eta_x^2\omega_x^2 - 1)\sin(\omega_x L_x) = 2\eta_x\omega_x\cos(\omega_x L_x) \tag{2.27}$$

For given $\eta_x$ and $L_x$, equation (2.27) can be solved to obtain a sequence of positive roots $\omega_{i,x}$, $i = 1, 2, \ldots$. The eigenvalue corresponding to $\omega_{i,x}$ follows from equation (2.24):

$$\lambda_{i,x} = \frac{2\eta_x\sigma^2}{\eta_x^2\omega_{i,x}^2 + 1} \tag{2.28}$$

Each $\omega_{i,x}$ gives different coefficients $c_{i,1}$ and $c_{i,2}$ of the eigenfunction. These coefficients are determined by the condition that the eigenfunctions are normalized:

$$\int_0^{L_x} f_{i,x}^2(x)\, dx = 1 \tag{2.29}$$

$$c_{i,2} = \frac{1}{\sqrt{(\eta_x^2\omega_{i,x}^2 + 1)L_x/2 + \eta_x}} \tag{2.30}$$

$$c_{i,1} = \eta_x\,\omega_{i,x}\,c_{i,2} \tag{2.31}$$
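The one-dimensional procedure above (roots of the characteristic equation, then eigenvalues, then normalized coefficients) can be sketched numerically. The code below is an illustrative sketch rather than the thesis implementation: the names `char_fn`, `kle_1d`, and `eigenfunctions` are hypothetical, the root search is a plain scan with bisection chosen for simplicity, and the chip length is assumed to be normalized to 1.

```python
import numpy as np

def char_fn(w, eta, length):
    """Left side minus right side of the characteristic equation (2.27)."""
    return (eta**2 * w**2 - 1.0) * np.sin(w * length) - 2.0 * eta * w * np.cos(w * length)

def kle_1d(eta, length, sigma2, n_terms):
    """Return (omega, lam, c1, c2) for the first n_terms 1-D eigenpairs."""
    roots = []
    step = np.pi / (200.0 * length)   # scan step, small against the ~pi/length root spacing
    a = step                          # start above the trivial root w = 0
    fa = char_fn(a, eta, length)
    while len(roots) < n_terms:
        b = a + step
        fb = char_fn(b, eta, length)
        if fa * fb < 0.0:             # a sign change brackets a root
            lo, hi, f_lo = a, b, fa
            for _ in range(60):       # plain bisection
                mid = 0.5 * (lo + hi)
                f_mid = char_fn(mid, eta, length)
                if f_lo * f_mid <= 0.0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            roots.append(0.5 * (lo + hi))
        a, fa = b, fb
    omega = np.array(roots)
    lam = 2.0 * eta * sigma2 / (eta**2 * omega**2 + 1.0)                     # eq. (2.28)
    c2 = 1.0 / np.sqrt((eta**2 * omega**2 + 1.0) * length / 2.0 + eta)     # eq. (2.30)
    c1 = eta * omega * c2                                                   # eq. (2.31)
    return omega, lam, c1, c2

def eigenfunctions(x, omega, c1, c2):
    """Evaluate eq. (2.23) for every mode; row k holds mode k on the grid x."""
    return c1[:, None] * np.cos(omega[:, None] * x) + c2[:, None] * np.sin(omega[:, None] * x)

# Correlation length 0.31 as in the text; unit chip length and unit variance are assumptions.
omega, lam, c1, c2 = kle_1d(eta=0.31, length=1.0, sigma2=1.0, n_terms=5)
```

A quick sanity check for such a routine is to substitute each eigenpair back into the integral equation (2.17) with a numerical quadrature and to confirm that the eigenfunctions are orthonormal on [0, L_x].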

Fig. 2.2 shows the trend of the two-dimensional eigenvalues $\lambda_n$ of the exponential kernel for correlation lengths $\eta_x = \eta_y = 0.31$.

Fig. 2.2: The eigenvalues decay rate. Plot title: The Normalized Eigenvalues Decay Rate; x-axis: index of the two-dimensional eigenvalues (0 to 100); y-axis: normalized two-dimensional eigenvalues (0 to 0.2).

The exponential kernel $C(\mathbf{x}_1, \mathbf{x}_2) = \sigma^2 e^{-|x_1 - x_2|/\eta_x} e^{-|y_1 - y_2|/\eta_y}$ can be expanded in terms of its eigenfunctions and eigenvalues:

$$C(\mathbf{x}_1, \mathbf{x}_2) \approx \sum_{k=1}^{N_{kl}} \lambda_k f_k(\mathbf{x}_1) f_k(\mathbf{x}_2) \tag{2.32}$$

where $N_{kl}$ is the truncation number of the approximation. Fig. 2.3 shows the exponential covariance surface adopted in this work. The approximated covariance surface and the relative error for $N_{kl} = 25$ are shown in Fig. 2.4, and those for $N_{kl} = 75$ are shown in Fig. 2.5.
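The effect of the truncation number on the covariance approximation can be illustrated with a discrete analogue. The sketch below is not the analytical procedure of this section: it assumes a one-dimensional exponential kernel sampled on a uniform grid and uses the eigen-decomposition of the resulting covariance matrix in place of the eigenfunctions; the function name `truncation_error` and all parameter values are hypothetical.

```python
import numpy as np

def truncation_error(eta, length, sigma2, n_grid, n_terms):
    """Largest absolute error of the n_terms-truncated covariance, relative to sigma^2."""
    x = (np.arange(n_grid) + 0.5) * length / n_grid            # cell centers
    cov = sigma2 * np.exp(-np.abs(x[:, None] - x[None, :]) / eta)
    vals, vecs = np.linalg.eigh(cov)                           # eigenvalues in ascending order
    keep = np.argsort(vals)[::-1][:n_terms]                    # indices of the largest n_terms
    cov_trunc = (vecs[:, keep] * vals[keep]) @ vecs[:, keep].T
    return np.max(np.abs(cov - cov_trunc)) / sigma2

# Error for a few truncation numbers on a 200-point grid (illustrative values).
errors = {n: truncation_error(0.31, 1.0, 1.0, 200, n) for n in (5, 25, 75)}
```

In this discrete setting the error shrinks as more terms are kept, which mirrors the comparison between the 25-term and 75-term surfaces described above.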

Fig. 2.4: (a) 25-term approximation of covariance surface; σ = 4.3. (b) 25-term relative error surface of covariance approximation.

Fig. 2.5: (a) 75-term approximation of covariance surface; σ = 4.3. (b) 75-term relative error surface of covariance approximation.

2.3 The Bases for Random Space : Polynomial Chaos

The generalized polynomial chaos, also called the Askey chaos, uses orthogonal polynomials as the trial basis in the random space to expand a stochastic process. The original polynomial chaos, termed the Hermite chaos, was first introduced by Wiener [9]. Ghanem and Spanos pioneered the use of Hermite orthogonal polynomials in terms of Gaussian random variables to deal with various problems in mechanics [9]. The theorem of Cameron and Martin [15] guarantees that a general second-order random process $u(\theta)$ can be represented in the following form:

$$u(\theta) = c_0\Gamma_0 + \sum_{i_1=1}^{\infty} c_{i_1}\Gamma_1(\xi_{i_1}) + \sum_{i_1=1}^{\infty}\sum_{i_2=1}^{i_1} c_{i_1 i_2}\Gamma_2(\xi_{i_1}, \xi_{i_2}) + \sum_{i_1=1}^{\infty}\sum_{i_2=1}^{i_1}\sum_{i_3=1}^{i_2} c_{i_1 i_2 i_3}\Gamma_3(\xi_{i_1}, \xi_{i_2}, \xi_{i_3}) + \cdots$$

where $\Gamma_r(\xi_{i_1}, \ldots, \xi_{i_r})$ denotes the polynomial chaos of order $r$ in the random variables $(\xi_{i_1}, \ldots, \xi_{i_r})$ taken from the N-dimensional random vector $\vec{\xi}$. For Gaussian random variables, the polynomial chaos is the Hermite polynomial chaos. The general expression of the multi-dimensional Hermite polynomials is

$$\Gamma_r(\xi_{i_1}, \ldots, \xi_{i_r}) = (-1)^r\, e^{\frac{1}{2}\vec{\xi}^T\vec{\xi}}\, \frac{\partial^r}{\partial\xi_{i_1}\cdots\partial\xi_{i_r}}\, e^{-\frac{1}{2}\vec{\xi}^T\vec{\xi}}$$

The zeroth-, first-, and second-order Hermite polynomial chaos are given by

$$\Gamma_0 = 1; \qquad \Gamma_1(\xi_i) = \xi_i; \qquad \Gamma_2(\xi_i, \xi_j) = \xi_i\xi_j - \delta_{ij}$$

where $\delta_{ij}$ is the Kronecker delta. For clarity, the above general second-order random process $u(\theta)$ can be expressed in the more concise form

$$u(\theta) = \sum_{j=1}^{\infty} \hat{a}_j \Phi_j(\vec{\xi}) \tag{2.33}$$

where there is a one-to-one mapping between the polynomial chaos $\Gamma[\cdot]$ and $\Phi[\cdot]$, and also between the coefficients $\hat{a}_j$ and $c_{i_1 \ldots i_r}$. The polynomial chaos basis functions satisfy the following orthogonality property:

$$\langle \Phi_i \Phi_j \rangle = \langle \Phi_i^2 \rangle\, \delta_{ij}$$

where $\langle \cdot \rangle$ denotes the inner product defined as

$$\langle f(\vec{\xi})\, g(\vec{\xi}) \rangle = \frac{1}{\sqrt{(2\pi)^n}} \int f(\vec{\xi})\, g(\vec{\xi})\, e^{-\frac{1}{2}\vec{\xi}^T\vec{\xi}}\, d\vec{\xi}$$
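The orthogonality property can be checked numerically in the one-dimensional case. The sketch below assumes a single standard Gaussian variable, for which the Hermite chaos reduces to the probabilists' Hermite polynomials available in NumPy as `numpy.polynomial.hermite_e`; the function name `hermite_gram` is hypothetical. It evaluates the inner product with Gauss-Hermite quadrature.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He

def hermite_gram(max_order, n_quad=20):
    """Matrix of inner products <He_i He_j> under the standard normal density."""
    nodes, weights = He.hermegauss(n_quad)          # weight function exp(-x^2 / 2)
    weights = weights / math.sqrt(2.0 * math.pi)    # normalize to the Gaussian pdf
    # Row k holds He_k evaluated at the quadrature nodes.
    basis = np.array([He.hermeval(nodes, [0.0] * k + [1.0])
                      for k in range(max_order + 1)])
    return (basis * weights) @ basis.T

gram = hermite_gram(4)
```

The off-diagonal entries vanish up to rounding, and the diagonal holds the norms <Φ_k²> = k!, which are the denominators needed when projecting a random quantity onto the basis.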

2.4 Statistical Leakage Power Modeling

In this section, we introduce the empirical models for the subthreshold and gate leakage currents under uncertainty in physical parameters such as channel length and oxide thickness. The leakage current depends on the input pattern and the logic topology. We evaluate the average leakage over input patterns by HSPICE simulation for various types of logic gates. From the HSPICE simulation results, we obtain the fitting constants of the empirical current models by the least-squares method. The maximum error of the fitted models is no more than 2% in comparison with the HSPICE simulation results.

2.4.1 Gate Tunneling Leakage Current

According to quantum mechanics, there is a finite probability that carriers will tunnel through the gate oxide. The result is the gate tunneling leakage current, which flows into the gate. The tunneling probability is an exponential function of the oxide thickness, so the gate tunneling leakage current increases exponentially as the gate oxide becomes thinner. When the oxide is thicker than 20 Å, the gate tunneling leakage current is relatively small in comparison with other leakage components such as the subthreshold leakage current. For oxides thinner than 15 to 20 Å, the tunneling current becomes an important factor and may become comparable to the subthreshold leakage current in advanced processes. Briefly, the dependence of the gate leakage current on the oxide thickness is given by the following formula [32]:

$$I_{gate} = (A \cdot C)(W \cdot L)\, e^{-B \cdot T_{ox}\,\alpha / V_{gs}}$$

where $A = q^3/(8\pi h\phi_b)$, $B = 8\pi\sqrt{2m_{ox}}\,\phi_b^{3/2}/(3hq)$, $C = (V_{gs}/T_{ox})^2$, $\alpha$ is a parameter which is [...] and $\phi_b$ is the barrier height for electrons/holes in the conduction/valence band. Note that the parameter variations are in general around 10-20% [33]. Hence, we use a first-order Taylor expansion at the nominal value of the oxide thickness and adopt the following gate tunneling leakage current model derived in [28]:

$$I_{gate} = a_0\, e^{a_1 \Delta T_{ox}} \tag{2.34}$$

where $a_0$ and $a_1$ are fitting constants and $\Delta T_{ox}$ denotes the fractional variation around the nominal value of the oxide thickness. We combine this current model with the physical parameter random process expanded by the KLE and set the supply voltage to 1 V, so that the stochastic gate tunneling leakage power may be expressed as:

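$$p_g(\vec{\varsigma}) = \tilde{p}_g \exp\!\Big(a_1 \sum_{i=1}^{N'_{kl}} f'_i(x^*, y^*)\,\varsigma_i\Big) \tag{2.35}$$

where $\tilde{p}_g = a_0 e^{a_1 \tilde{T}_{ox}}$ denotes the deterministic nominal gate leakage power, $\tilde{T}_{ox}$ is the nominal value of the oxide thickness, $N'_{kl}$ is the truncation number of the KLE for the oxide thickness, and $f'_i(x^*, y^*)$ is the eigenfunction combined with the eigenvalue term of the oxide thickness for a logic gate at position $(x^*, y^*)$. The gate power random process can be represented in terms of Hermite polynomials:

$$p_g(\vec{\varsigma}) = \sum_{i=1}^{\infty} \gamma_i \Phi_i(\vec{\xi}) \tag{2.36}$$

$$\gamma_i = \frac{\langle p_g(\vec{\varsigma})\, \Phi_i(\vec{\xi}) \rangle}{\langle \Phi_i^2(\vec{\xi}) \rangle} \tag{2.37}$$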
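The projection in (2.37) can be illustrated in a reduced setting. The sketch below is not the procedure of the thesis: it assumes a single Gaussian variable ξ and a power of the form p(ξ) = p̃·exp(a·ξ), where the names `project_exponential`, `p_nominal`, and `a` and their values are hypothetical. It computes the coefficients by Gauss-Hermite quadrature. For this one-variable case the exact coefficients are p̃·exp(a²/2)·aᵏ/k!, which gives a reference to compare against.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He

def project_exponential(p_nominal, a, max_order, n_quad=30):
    """Hermite chaos coefficients of p(xi) = p_nominal * exp(a * xi), xi ~ N(0, 1)."""
    nodes, weights = He.hermegauss(n_quad)          # weight function exp(-x^2 / 2)
    weights = weights / math.sqrt(2.0 * math.pi)    # normalize to the Gaussian pdf
    values = p_nominal * np.exp(a * nodes)
    coeffs = []
    for k in range(max_order + 1):
        basis_k = He.hermeval(nodes, [0.0] * k + [1.0])     # He_k at the nodes
        numerator = np.sum(weights * values * basis_k)      # <p He_k>
        coeffs.append(numerator / math.factorial(k))        # divide by <He_k^2> = k!
    return np.array(coeffs)

# Hypothetical nominal power and sensitivity, for illustration only.
gamma = project_exponential(p_nominal=1.0, a=0.5, max_order=4)
exact = np.array([math.exp(0.125) * 0.5**k / math.factorial(k) for k in range(5)])
```

The zeroth coefficient is the mean of the power, and it exceeds the nominal value p̃ because the exponential of a zero-mean Gaussian has a mean larger than one.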

2.4.2 Subthreshold Leakage Current

The subthreshold leakage current is defined as the conduction current between the source and drain of a MOS transistor in the "off" state. We apply the following empirical model, introduced in [10], to describe the subthreshold leakage current:

$$I_{sub} = b_0\, e^{b_1 L + b_2 L^2} \tag{2.38}$$

where $b_0$, $b_1$, and $b_2$ are fitting constants and $L$ is the channel length. Substituting the KLE of the channel length, equation (2.5), into equation (2.38), the statistical subthreshold leakage power may be given by:

$$p_s(\vec{\zeta}) = \tilde{p}_s \exp\!\Big((b_1 + 2 b_2 \tilde{L}) \sum_{i=1}^{N''_{kl}} f''_i(x^*, y^*)\,\zeta_i + b_2\, q(x^*, y^*, \vec{\zeta})\Big) \tag{2.39}$$

$$q(x^*, y^*, \vec{\zeta}) = \Big\{\sum_{i=1}^{N''_{kl}} f''_i(x^*, y^*)\,\zeta_i\Big\}^2 \tag{2.40}$$

where $\tilde{p}_s$ denotes the nominal subthreshold leakage power, $\tilde{L}$ is the nominal value of the channel length, $N''_{kl}$ is the truncation number of the KLE for the channel length, and $f''_i(x^*, y^*)$ is the eigenfunction combined with the eigenvalue term of the channel length for a logic gate at position $(x^*, y^*)$.
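The exponent of the statistical subthreshold power can be checked by expanding the exponent of the empirical current model. The worked step below writes the zero-mean KLE sum as δ, a shorthand introduced here for illustration, and assumes that the mean channel length at the gate position equals the nominal value.

```latex
\begin{aligned}
L &= \tilde{L} + \delta, \qquad \delta = \sum_{i=1}^{N''_{kl}} f''_i(x^*, y^*)\,\zeta_i, \\
b_1 L + b_2 L^2 &= \left(b_1 \tilde{L} + b_2 \tilde{L}^2\right) + \left(b_1 + 2 b_2 \tilde{L}\right)\delta + b_2\,\delta^2 .
\end{aligned}
```

The constant term belongs to the nominal power, the term linear in δ gives the first part of the exponent, and δ² is the quantity q.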

Expanding the subthreshold power random process in Hermite polynomials gives

$$p_s(\vec{\zeta}) = \sum_{i=1}^{\infty} \alpha_i \Phi_i(\vec{\xi}) \tag{2.41}$$

$$\alpha_i = \frac{\langle p_s(\vec{\zeta})\, \Phi_i(\vec{\xi}) \rangle}{\langle \Phi_i^2(\vec{\xi}) \rangle} \tag{2.42}$$

The key point is how to obtain the coefficients $\{\gamma_i\}$ and $\{\alpha_i\}$. Their computation is introduced in Section 3.3.

As noted above, the subthreshold and gate leakage strongly depend on the input pattern and the circuit topology. We evaluate the leakage currents over input patterns to obtain the average leakage current from HSPICE simulation based on the TSMC 65nm technology model [29]. The fitting constants $a_0$, $a_1$, $b_0$, $b_1$, and $b_2$ are obtained by least-squares fitting with a maximum error of no more than 2%.
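A least-squares fit of this kind can be sketched for the subthreshold model. The example is only an illustration under stated assumptions: it fits the logarithm of the current, so the fit is a quadratic polynomial in L, which may differ from the exact fitting objective used with the HSPICE data; the sample points are synthetic values generated from hypothetical constants, not simulation results; and the function names are hypothetical.

```python
import numpy as np

def fit_exponential_quadratic(length, current):
    """Fit I = b0 * exp(b1 * L + b2 * L^2) by least squares on ln(I)."""
    c2, c1, c0 = np.polyfit(length, np.log(current), 2)   # highest power first
    return np.exp(c0), c1, c2

def max_relative_error(length, current, b0, b1, b2):
    """Largest relative deviation of the fitted model from the samples."""
    model = b0 * np.exp(b1 * length + b2 * length**2)
    return np.max(np.abs(model - current) / current)

# Synthetic samples from hypothetical constants (channel length in units of its nominal value).
true_b = (2.0, -3.0, 0.5)
length = np.linspace(0.8, 1.2, 9)
current = true_b[0] * np.exp(true_b[1] * length + true_b[2] * length**2)

b0, b1, b2 = fit_exponential_quadratic(length, current)
err = max_relative_error(length, current, b0, b1, b2)
```

With measured or simulated currents in place of the synthetic samples, `err` plays the role of the maximum fitting error quoted in the text.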

Noted that the total number of polynomial chaos depends on the value (N_kl0 + N_kl00),

dimen-sion of random variables set {ξi} truncated from KLE. In fact, the set of polynomial chaos can

be reduced to a new one, because the projection value of gate and subthreshold leakage power upon the polynomial basis which is function of ~ς and ~ζ will be equal to zero. For example,

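$$\langle p_s(\vec{\zeta})\,\zeta_i \varsigma_j \rangle = \langle p_s(\vec{\zeta})\,\zeta_i \rangle \langle \varsigma_j \rangle = 0$$

$$\langle p_g(\vec{\varsigma})\,\zeta_i \varsigma_j \rangle = \langle p_g(\vec{\varsigma})\,\varsigma_j \rangle \langle \zeta_i \rangle = 0 \tag{2.43}$$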

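Thus, each polynomial basis of the new set is a function of either $\vec{\varsigma}$ or $\vec{\zeta}$ alone. For polynomial chaos of order $p$, the total number of polynomial chaoses in the new set is given by

$$N_{pc} = 1 + \sum_{s=1}^{p} \frac{1}{s!} \prod_{r=0}^{s-1} (N'_{kl} + r) + \sum_{s=1}^{p} \frac{1}{s!} \prod_{r=0}^{s-1} (N''_{kl} + r) \tag{2.44}$$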
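The size of the reduced basis can be tabulated with a few lines of code. The sketch below counts the Hermite polynomials of each order $s$ in $N$ variables as $\frac{1}{s!}\prod_{r=0}^{s-1}(N + r)$, which is the count that reproduces the $N_{pc}$ values reported in Table 4.1; the function names are hypothetical.

```python
from math import factorial

def count_single_set(n_kl, order):
    """Number of Hermite polynomials of order 1..order in n_kl variables."""
    total = 0
    for s in range(1, order + 1):
        prod = 1
        for r in range(s):
            prod *= n_kl + r
        # A product of s consecutive integers is divisible by s!, so this is exact.
        total += prod // factorial(s)
    return total

def reduced_npc(n_kl_tox, n_kl_len, order):
    """Constant term plus the bases in the oxide-thickness and channel-length variables."""
    return 1 + count_single_set(n_kl_tox, order) + count_single_set(n_kl_len, order)

# Settings that appear in Table 4.1 (both truncation numbers set to Nkl).
for n_kl, order in [(25, 1), (50, 1), (75, 1), (20, 2), (25, 2), (30, 2)]:
    print(n_kl, order, reduced_npc(n_kl, n_kl, order))
```

For comparison, a full basis in all $N'_{kl} + N''_{kl}$ variables would also contain the mixed second-order products, which the zero projections in (2.43) make unnecessary.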


2.5 Monte Carlo Technique

Numerical methods that make use of random variables are known as Monte Carlo methods. They serve as the benchmark against which all modeling and analysis techniques are tested. In this work, we use the Monte Carlo method to produce the golden solutions for stochastic thermal analysis of the full chip considering within-die spatial correlation under process variations. The key point is how to generate multinormally distributed random variables. First, the within-die spatial correlation of the parameters is modeled by partitioning the chip into N grid cells. We assume perfect correlation among the devices in the same grid cell, high correlation among those in nearby grid cells, and low or zero correlation among those in far-away grid cells. Note that the dimension of the random variable set is N. Then, we construct the N × N covariance matrix Σ based on the given covariance kernel. From the covariance matrix Σ, we generate the multinormal random variables by applying Cholesky factorization; the detailed procedures are given in Fig. 2.6 and Fig. 2.7 [34].

Algorithm Multinormal-Cholesky

Input: Dimension $N$, covariance matrix $\Sigma$.

Output: Multinormal distributed vector $\vec{X}$ with mean 0 and covariance $\Sigma$.

1 Begin
2   Compute Cholesky factor $L$ of $\Sigma$ by Algorithm Cholesky-Decomposition in Fig. 2.7.
3   Generate vector $\vec{\epsilon} = (\epsilon_1, \epsilon_2, ..., \epsilon_N)^T$ of $N$ independent standard normal variates.
4   $\vec{X} \leftarrow L\vec{\epsilon}$
5   Return vector $\vec{X}$
6 End


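Algorithm Cholesky-Decomposition

Input: Positive definite $N \times N$ matrix $\Sigma = (\sigma_{ij})$.

Output: Lower triangular matrix $L = (l_{ij})$ such that $LL^T = \Sigma$.

1 Begin
2 for $i = 1$ to $N$ do
3   $l_{i1} \leftarrow \sigma_{i1} / \sqrt{\sigma_{11}}$
4 for $i = 2$ to $N$ do
5   for $j = 2$ to $i - 1$ do
6     $l_{ij} \leftarrow (\sigma_{ij} - \sum_{k=1}^{j-1} l_{ik} l_{jk}) / l_{jj}$
7   $x \leftarrow \sigma_{ii} - \sum_{j=1}^{i-1} l_{ij}^2$
8   if $x \ge 0$ then
9     $l_{ii} \leftarrow \sqrt{x}$
10   else
11     abort ($\Sigma$ not positive definite)
12 $l_{ij} \leftarrow 0$ for $1 \le i < j \le N$
13 Return matrix $L = (l_{ij})$
14 End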

Fig. 2.7: Procedure of Cholesky-Decomposition
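The two procedures can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the thesis implementation: the three-cell layout, the exponential covariance kernel, and the value of `eta` are assumptions made only to have a concrete positive definite matrix, and the sketch rejects a zero pivot as well so that no division by zero can occur.

```python
import math
import random

def cholesky(sigma):
    """Return lower-triangular L with L L^T = sigma, computed row by row."""
    n = len(sigma)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = (sigma[i][j] - s) / L[j][j]
        x = sigma[i][i] - sum(L[i][j] ** 2 for j in range(i))
        if x <= 0.0:  # stricter than "x >= 0": a zero pivot would break later divisions
            raise ValueError("covariance matrix is not positive definite")
        L[i][i] = math.sqrt(x)
    return L

def sample_multinormal(L, rng):
    """Draw X = L * eps, where eps holds independent standard normal variates."""
    n = len(L)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(L[i][k] * eps[k] for k in range(i + 1)) for i in range(n)]

# Hypothetical example: three grid cells on a line, exponential kernel (assumed).
eta = 0.31
centers = [0.0, 0.5, 1.0]
sigma = [[math.exp(-abs(a - b) / eta) for b in centers] for a in centers]

L = cholesky(sigma)
rng = random.Random(0)  # fixed seed so the run is repeatable
x = sample_multinormal(L, rng)
```

Each call to `sample_multinormal` yields one Monte Carlo sample of the correlated parameter deviations; the factor `L` is computed once and reused for all samples.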

Using the technique, samples of the required random variables to perform Monte Carlo analysis can be generated. For most purposes, variations in VLSI design are assumed to be Gaussian. Consequently, while analyzing intra-die variations, we need to generate samples of multinormal random variables.


Chapter 3 Stochastic Thermal Simulation Methodology

3.1 Stochastic Thermal Simulation Problem Formulation

The silicon die consumes dynamic power and leakage power and is the main source of heat generation. Heat from the silicon die is transferred to the ambient through two heat flow paths. The primary heat flow path is through the thermal interface material, heat spreader, and heat sink. The secondary heat flow path is through the interconnect layers, ceramic substrate, and printed-circuit board. The typical compact thermal model for the early-stage VLSI design flow is shown in Fig. 3.1.

[Figure labels: Die (0 ≤ x ≤ Lx, 0 ≤ y ≤ Ly, −Lz ≤ z ≤ 0); Heat Sink; Heat Spreader; Thermal Interface Material; Substrate; Interconnect Layers; I/O Pads & PCB; Ambient Air; Primary Heat Flow Path (hp); Secondary Heat Flow Path (hs); power p(r, t, L(x, y, θ), Tox(x, y, ϖ)).]

Fig. 3.1: Compact thermal model of the early design stage for stochastic heat sources.


deterministic [31]. However, the leakage power dissipation of ICs is no longer deterministic for sub-65nm technology. As CMOS technology continues to scale down, fluctuations in physical parameters such as channel length and oxide thickness make the leakage power consumption uncertain. Moreover, leakage power has become the major contributor to the total power consumption of VLSI circuits in today's technology. Thus, thermal simulation in leakage-power-dominated technology must be treated statistically. By combining the boundary conditions of the compact thermal model with the stochastic power dissipation process, the stochastic heat transfer equation with boundary conditions is given as [20]

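$$\nabla \cdot \big(\kappa(r)\nabla T(r, t, \theta, \varpi)\big) = \sigma(r)\frac{\partial T(r, t, \theta, \varpi)}{\partial t}; \quad r \in D \tag{3.1}$$

$$\kappa(r)\frac{\partial T(r, t, \theta, \varpi)}{\partial n_{bs}} + h_{bs} T(r, t, \theta, \varpi) = f_{bs}(r) \tag{3.2}$$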

where $r = (x, y, z)$ is defined in the system domain $D = \{(0, L_x) \times (0, L_y) \times (-L_z, 0)\}$, $L_x$ and $L_y$ are the lateral sizes of the die, $L_z$ is the thickness of the die, $\kappa(r)$ is the thermal conductivity (W/m·°C) of the die, $\sigma(r)$ is the product of the material density and specific heat (J/m³·°C) of the die, $\nabla$ is the del operator (so that $\nabla\cdot$ denotes the divergence), $h_{bs}$ is the heat-transfer coefficient on the boundary surface $bs$ of the die, $f_{bs}(r)$ is the heat flux function on the boundary surface, and $\partial/\partial n_{bs}$ is the derivative along the outward normal of the boundary surface. $\theta$ and $\varpi$ belong to the sets of manufacturing outcomes for channel length, $\Omega_L$, and oxide thickness, $\Omega_{T_{ox}}$, respectively.

From the observations in [21] [22], the heat transfer coefficients of the primary path can be modeled as an effective heat transfer coefficient $h_p$ by combining the effect of each component on the primary path. Since the detailed information of the interconnect layers is not available in the early physical design stage, the interconnect layers are modeled as an equivalent thermal resistance based on the material density of a regular structure, following [21] [22]. The heat transfer coefficients of the secondary path can be simplified to an equivalent heat transfer coefficient $h_s$ by stacking the thermal resistances of each interconnect layer, the I/O pads, and the printed circuit board. The boundary condition on the vertical surfaces of the chip in Fig. 3.1 can be set to adiabatic, because the area of the vertical surfaces is much smaller than that of the horizontal surfaces and the thermal conductivity of air is much lower than that of the primary and secondary heat transfer paths [19]. The heat sources generated by different sub-circuits can be attached to the top surface of the die to model the boundary condition. Although the thermal properties of the die, $\kappa(r)$ and $\sigma(r)$, are position-dependent, the variations of these thermal parameters are usually not significant, and they can be treated as constants when performing thermal-aware floorplanning and placement.

With the above description, the stochastic heat transfer equation can be rewritten as

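$$\kappa\nabla^2 T(r, t, \theta, \varpi) = \sigma\frac{\partial T(r, t, \theta, \varpi)}{\partial t}; \quad r \in D \tag{3.3}$$

$$\left.\frac{\partial T(r, t, \theta, \varpi)}{\partial x}\right|_{x=0,L_x} = \left.\frac{\partial T(r, t, \theta, \varpi)}{\partial y}\right|_{y=0,L_y} = 0 \tag{3.4}$$

$$\kappa\left.\frac{\partial T(r, t, \theta, \varpi)}{\partial z}\right|_{z=-L_z} = h_p\, T(r, t, \theta, \varpi)\big|_{z=-L_z} \tag{3.5}$$

$$\kappa\left.\frac{\partial T(r, t, \theta, \varpi)}{\partial z}\right|_{z=0} = h_s\, T(r, t, \theta, \varpi)\big|_{z=0} + p(r, t, L(r, \theta), T_{ox}(r, \varpi))\big|_{z=0} \tag{3.6}$$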

where $p(r, t, L(r, \theta), T_{ox}(r, \varpi))$ is the random process of total power dissipation; it consists of the dynamic power $p_d(r, t)$, the subthreshold leakage power $p_s(r, t, L(r, \theta))$, and the gate leakage power $p_g(r, t, T_{ox}(r, \varpi))$. The leakage power is strongly affected by uncertain physical parameters such as channel length and oxide thickness, and needs to be treated as a random process. The total leakage power random process is described in detail in Section 2.4.

3.2 Stochastic Thermal Simulation Flowchart

The execution flow of this work is summarized in Fig. 3.2. Given a spatial covariance function of a technology parameter, we compute the eigenvalues and eigenfunctions of the covariance kernel. By applying the Karhunen-Loeve expansion, the correlated random processes of the physical parameters (channel length and oxide thickness) are transformed into a set of uncorrelated random variables based on these eigenvalues and eigenfunctions. With those normalized random variables, we build the polynomial chaoses that serve as the polynomial bases for the space of random variables. For the power consumption, we create leakage current models for the various types of logic gates from HSPICE simulation based on TSMC 65nm technology. Once the chip geometry, package configuration, gate-level placement, and dynamic power distribution are given, the compact thermal model of Fig. 3.1 in Section 3.1 can be built. Then, we employ the stochastic Galerkin projection method to convert the stochastic heat transfer equation into a set of deterministic heat transfer equations, whose number equals the total number of polynomial chaoses. Finally, an efficient GIT-based analytical thermal simulator [14] is used to solve those deterministic heat transfer equations, and the mean value and variance of the full-chip temperature distribution are obtained.

[Fig. 3.2 flowchart blocks: Circuit Benchmark (DEF) and Dynamic Power; Technology Parameters; Cell Library (LEF); Leakage Power Cell Library; Spatial Correlation Modeling; Karhunen-Loeve Expansion; Parameters Transform; Construct Hermite Polynomial Chaos (Random Space Bases); Stochastic Heat Transfer Equation; Expanded as the Hermite Polynomial Chaos with Function of Position Coefficients; Stochastic Galerkin Projection; Deterministic Heat Transfer Equation; Analytic Thermal Simulator: GIT; Construct A Set of Spatial Space Bases to Obtain the Function of Position Coefficients; Result: Mean and Variance of the Temperature Distribution.]


3.3 Stochastic Galerkin Procedure

After applying the Karhunen-Loeve expansion, the system random process with spatial correlation is transformed into a set of uncorrelated standard normal random variables. From this set of uncorrelated random variables $\vec{\xi}$, we construct the Hermite polynomial chaos to serve as the basis of the random space. The temperature random process $T(r, t, \vec{\varsigma}, \vec{\zeta})$ can then be expanded in the Hermite polynomial chaos as

$$T(r, t, \vec{\xi}) \simeq \sum_{i=0}^{N_{pc}} \hat{T}_i(r, t)\,\Phi_i(\vec{\xi}) \tag{3.7}$$

Substituting the expansion (3.7) into equation (3.3), the residual can be expressed as

$$R(r, t, \vec{\xi}) \equiv \kappa\nabla^2 \sum_{i=0}^{N_{pc}} \hat{T}_i(r, t)\,\Phi_i(\vec{\xi}) - \sigma\frac{\partial}{\partial t} \sum_{i=0}^{N_{pc}} \hat{T}_i(r, t)\,\Phi_i(\vec{\xi}) \tag{3.8}$$

The stochastic Galerkin principle enforces the residual to be orthogonal to each of the basis functions,

$$\langle R(r, t, \vec{\xi})\,\Phi_k(\vec{\xi}) \rangle = 0 \quad (k = 0, 1, ..., N_{pc}) \tag{3.9}$$

The orthogonality relation results in a set of deterministic equations, one for each basis function,

$$\kappa\nabla^2 \hat{T}_k(r, t) = \sigma\frac{\partial \hat{T}_k(r, t)}{\partial t} \quad (k = 0, 1, ..., N_{pc}) \tag{3.10}$$
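The step from (3.9) to (3.10) relies on the orthogonality of the Hermite polynomial chaos, $\langle \Phi_i \Phi_k \rangle = \langle \Phi_k^2 \rangle \delta_{ik}$. A short worked version, using the same symbols as above and assuming the expansion coefficients do not depend on $\vec{\xi}$, is:

```latex
\begin{aligned}
\langle R(r,t,\vec{\xi})\,\Phi_k(\vec{\xi}) \rangle
  &= \sum_{i=0}^{N_{pc}} \Big( \kappa\nabla^2 \hat{T}_i(r,t)
     - \sigma \frac{\partial \hat{T}_i(r,t)}{\partial t} \Big)
     \langle \Phi_i(\vec{\xi})\,\Phi_k(\vec{\xi}) \rangle \\
  &= \Big( \kappa\nabla^2 \hat{T}_k(r,t)
     - \sigma \frac{\partial \hat{T}_k(r,t)}{\partial t} \Big)
     \langle \Phi_k^2(\vec{\xi}) \rangle = 0 .
\end{aligned}
```

Since $\langle \Phi_k^2 \rangle > 0$, the bracket must vanish for every $k$, which is the deterministic equation for $\hat{T}_k$.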

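Applying the same Galerkin projection procedure to the boundary conditions (3.4)-(3.5), the transformed equations are given by

$$\left.\frac{\partial \hat{T}_k(r, t)}{\partial x}\right|_{x=0,L_x} = \left.\frac{\partial \hat{T}_k(r, t)}{\partial y}\right|_{y=0,L_y} = 0 \tag{3.11}$$

$$\kappa\left.\frac{\partial \hat{T}_k(r, t)}{\partial z}\right|_{z=-L_z} = h_p\, \hat{T}_k(r, t)\big|_{z=-L_z} \tag{3.12}$$

Substituting the expansion (3.7) into equation (3.6), the residual is given by

$$R'(r, t, \vec{\xi}) \equiv \kappa\frac{\partial}{\partial z} \sum_{i=0}^{N_{pc}} \hat{T}_i(r, t)\,\Phi_i(\vec{\xi}) - h_s \sum_{i=0}^{N_{pc}} \hat{T}_i(r, t)\,\Phi_i(\vec{\xi}) - p(r, t, \vec{\xi}) \tag{3.13}$$

Applying the stochastic Galerkin projection procedure to the residual $R'(r, t, \vec{\xi})$,

$$\kappa\frac{\partial}{\partial z}\hat{T}_k(r, t) = h_s \hat{T}_k(r, t) + \frac{\langle p(r, t, \vec{\xi})\,\Phi_k(\vec{\xi}) \rangle}{\langle \Phi_k^2(\vec{\xi}) \rangle}; \quad (k = 0, 1, ..., N_{pc}) \tag{3.14}$$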

Once the second term on the right-hand side of equation (3.14) is determined, the deterministic heat transfer equations can be formulated. Focusing now on the power projection term,

$$\langle p(x, y, t, \vec{\xi})\,\Phi_k(\vec{\xi}) \rangle = p_d(x, y, t)\,\langle \Phi_k(\vec{\xi}) \rangle + H_g(x, y, t)\,\langle p_g(x, y, \vec{\varsigma})\,\Phi_k(\vec{\xi}) \rangle + H_s(x, y, t)\,\langle p_s(x, y, \vec{\zeta})\,\Phi_k(\vec{\xi}) \rangle \tag{3.15}$$

where $H_g(x, y, t)$ and $H_s(x, y, t)$ are functions of position and switching activity for the gate tunneling and subthreshold leakage power, respectively.

Here, considering the gate tunneling power projection term,

$$\langle p_g(x, y, \vec{\varsigma})\,\Phi_k(\vec{\xi}) \rangle = \tilde{p}_g(x, y)\,\Big\langle e^{a_1(x,y) \sum_{i=1}^{N'_{kl}} f'_i(x,y)\,\varsigma_i}\, \Phi_k(\vec{\xi}) \Big\rangle \tag{3.16}$$

where $\tilde{p}_g(x, y) = a_0(x, y)\, e^{a_1(x,y)\tilde{T}_{ox}(x,y)}$ is the deterministic nominal gate leakage power, $f'_i(x, y)$ is the eigenfunction combined with the eigenvalue term for oxide thickness, and $a_0(x, y)$ and $a_1(x, y)$ are the fitting constants of gate leakage power for the logic gate located at position $(x, y)$. Taking a reference position $(x^*, y^*)$ as an example, equation (3.16) can be rewritten as

$$\langle p_g(x^*, y^*, \vec{\varsigma})\,\Phi_k(\vec{\xi}) \rangle = \tilde{p}_g^{*}\,\Big\langle e^{a_1^{*} \sum_{i=1}^{N'_{kl}} f_i^{\prime *}\,\varsigma_i}\, \Phi_k(\vec{\xi}) \Big\rangle \tag{3.17}$$

If the polynomial chaos $\Phi_k(\vec{\xi})$ is a function of $\vec{\zeta}$, the value of equation (3.17) is zero. Otherwise, consider the case where $\Phi_k(\vec{\xi})$ is a first-order polynomial and a function of $\vec{\varsigma}$. Note that a set of independent standard normal random variables $X = \{X_1, X_2, ..., X_n\}$ has the following property,

$$\Big\langle e^{\sum_{j=1}^{n} \beta_j X_j}\, X_k \Big\rangle = \beta_k \prod_{i=1}^{n} e^{\beta_i^2/2} \quad \forall k \in \{1, 2, ..., n\} \tag{3.18}$$

Equation (3.17) can be evaluated from equation (3.18) for a first-order polynomial, and from the following equation (3.19) for a second-order polynomial.

$$\Big\langle e^{\sum_{j=1}^{n} \beta_j X_j}\, X_k^2 \Big\rangle = (\beta_k^2 + 1) \prod_{i=1}^{n} e^{\beta_i^2/2} \quad \forall k \in \{1, 2, ..., n\} \tag{3.19}$$
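Both identities follow from differentiating the moment generating function of independent standard normals, $M(\beta) = e^{\sum_j \beta_j^2 / 2}$, with respect to $\beta_k$. The sketch below compares the closed forms with central finite differences of $M$; the $\beta$ values are hypothetical stand-ins for $a_1^{*} f_j^{\prime *}$, and the helper names are illustrative only.

```python
import math

def mgf(beta):
    """Moment generating function of independent standard normals."""
    return math.exp(0.5 * sum(b * b for b in beta))

def first_order_closed(beta, k):
    """Right-hand side of (3.18)."""
    return beta[k] * mgf(beta)

def second_order_closed(beta, k):
    """Right-hand side of (3.19)."""
    return (beta[k] ** 2 + 1.0) * mgf(beta)

def shifted(beta, k, h):
    """Copy of beta with component k moved by h."""
    out = list(beta)
    out[k] += h
    return out

def first_order_numeric(beta, k, h=1e-4):
    """dM/dbeta_k by central difference, which equals <exp(beta.X) X_k>."""
    return (mgf(shifted(beta, k, h)) - mgf(shifted(beta, k, -h))) / (2.0 * h)

def second_order_numeric(beta, k, h=1e-3):
    """d^2M/dbeta_k^2 by central difference, which equals <exp(beta.X) X_k^2>."""
    return (mgf(shifted(beta, k, h)) - 2.0 * mgf(beta) + mgf(shifted(beta, k, -h))) / (h * h)

beta = [0.3, -0.5, 0.8]  # hypothetical values of a1* times f'_j*
k = 2
print(first_order_closed(beta, k), first_order_numeric(beta, k))
print(second_order_closed(beta, k), second_order_numeric(beta, k))
```

The closed forms need one product over the KLE terms per reference position, which is why the projection coefficients are inexpensive compared with sampling.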


Algorithm Gate Power Projection Procedure

Input: The constants $a_1^{*}$, $\{f_i^{\prime *}\}$, and polynomial $\Phi_k(\vec{\xi})$

Output: Return the value of $\langle e^{a_1^{*} \sum_{i=1}^{N'_{kl}} f_i^{\prime *} \varsigma_i}\, \Phi_k(\vec{\xi}) \rangle$

1 Begin
2 if the polynomial $\Phi_k(\vec{\xi})$ contains $\zeta_i$
3   Return 0
4 endif
5 do $C_g^{*} = \prod_{i=1}^{N'_{kl}} e^{(a_1^{*} f_i^{\prime *})^2 / 2}$
6 if the order of $\Phi_k(\vec{\xi})$ = 1
7   Return $a_1^{*} f_k^{\prime *}\, C_g^{*}$
8 elseif the order of $\Phi_k(\vec{\xi})$ = 2
9   Return $[(a_1^{*} f_k^{\prime *})^2 + 1]\, C_g^{*}$
10 endif
11 End

Fig. 3.3: Procedure of gate power projection

Now, considering the subthreshold leakage power projection term,

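$$\langle p_s(x, y, \vec{\zeta})\,\Phi_k(\vec{\xi}) \rangle = \tilde{p}_s(x, y)\,\Big\langle e^{B(x,y) \sum_{i=1}^{N''_{kl}} f''_i(x,y)\,\zeta_i + b_2(x,y)\, q(x,y,\vec{\zeta})}\, \Phi_k(\vec{\xi}) \Big\rangle \tag{3.20}$$

where

$$B(x, y) = b_1(x, y) + 2 b_2(x, y)\,\tilde{L}(x, y) \tag{3.21}$$

$$q(x, y, \vec{\zeta}) = \Big\{ \sum_{i=1}^{N''_{kl}} f''_i(x, y)\,\zeta_i \Big\}^2 \tag{3.22}$$

where $\tilde{p}_s(x, y) = b_0(x, y)\, e^{b_1(x,y)\tilde{L}(x,y) + b_2(x,y)\tilde{L}^2(x,y)}$ denotes the deterministic nominal subthreshold leakage power, $f''_i(x, y)$ is the eigenfunction combined with the eigenvalue term for channel length, and $b_0(x, y)$, $b_1(x, y)$, and $b_2(x, y)$ are the fitting constants of subthreshold leakage power. If the polynomial chaos $\Phi_k(\vec{\xi})$ is a function of $\vec{\varsigma}$, the value of equation (3.20) is zero. The quadratic form $q(x^*, y^*, \vec{\zeta})$ for the reference position $(x^*, y^*)$ can be expressed as $\vec{\zeta}^T A \vec{\zeta}$, with the symmetric matrix

$$A = \begin{pmatrix}
(f_1^{\prime\prime *})^2 & f_1^{\prime\prime *} f_2^{\prime\prime *} & \cdots & f_1^{\prime\prime *} f_{N''_{kl}}^{\prime\prime *} \\
f_1^{\prime\prime *} f_2^{\prime\prime *} & (f_2^{\prime\prime *})^2 & \cdots & f_2^{\prime\prime *} f_{N''_{kl}}^{\prime\prime *} \\
\vdots & \vdots & \ddots & \vdots \\
f_1^{\prime\prime *} f_{N''_{kl}}^{\prime\prime *} & f_2^{\prime\prime *} f_{N''_{kl}}^{\prime\prime *} & \cdots & (f_{N''_{kl}}^{\prime\prime *})^2
\end{pmatrix}$$

so that each pair of off-diagonal entries $A_{ij}$ and $A_{ji}$ together contributes the cross term $2 f_i^{\prime\prime *} f_j^{\prime\prime *} \zeta_i \zeta_j$ of the square in (3.22).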


The quadratic form $\vec{\zeta}^T A \vec{\zeta}$ can be reduced to its standard form $\vec{\nu}^T D \vec{\nu}$, with $\vec{\nu} = [\nu_1, \nu_2, ..., \nu_{N''_{kl}}]^T$. The standard form of $q(x^*, y^*, \vec{\zeta})$ is determined once the eigenvalues of $A$ are known, and the transformation between $\vec{\zeta}$ and $\vec{\nu}$ is given by $\vec{\zeta} = Q\vec{\nu}$. The real symmetric matrix $A$ has the eigenvalues $\lambda^A_1, \lambda^A_2, ..., \lambda^A_{N''_{kl}}$; let $Q$ be an orthogonal matrix that diagonalizes $A$, so that $Q^T A Q = D$, where $D$ is a diagonal matrix with the eigenvalues of $A$ on its leading diagonal. Because $Q$ is orthogonal, the components of $\vec{\nu}$ are again uncorrelated standard normal random variables. After the eigen-decomposition transformation, equation (3.20) can be rewritten for the reference point as

$$\langle p_s(x^*, y^*, \vec{\zeta})\,\Phi_k(\vec{\xi}) \rangle = \tilde{p}_s^{*}\,\Big\langle e^{B^{*} \sum_{i=1}^{N''_{kl}} C_i \nu_i + b_2^{*} \sum_{i=1}^{N''_{kl}} \lambda^A_i \nu_i^2}\, \Phi'_k(\vec{\nu}) \Big\rangle \tag{3.23}$$

where $C_i = \sum_{j=1}^{N''_{kl}} Q_{ji} f_j^{\prime\prime *}$, that is, $C = Q^T f^{\prime\prime *}$, so that $\sum_j f_j^{\prime\prime *} \zeta_j = \sum_i C_i \nu_i$. When the polynomial chaos $\Phi'_k(\vec{\nu})$ is a constant, equation (3.23) can be evaluated from the following property.

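$$\Big\langle e^{\sum_{j=1}^{n} (\alpha_j X_j^2 + \beta_j X_j)} \Big\rangle = \prod_{j=1}^{n} \frac{e^{\beta_j^2/(2 - 4\alpha_j)}}{(1 - 2\alpha_j)^{1/2}} \tag{3.24}$$

For the first-order polynomial chaos $\Phi'_k(\vec{\nu})$,

$$\Big\langle e^{\sum_{j=1}^{n} (\alpha_j X_j^2 + \beta_j X_j)}\, X_k \Big\rangle = \frac{\beta_k}{1 - 2\alpha_k} \prod_{j=1}^{n} \frac{e^{\beta_j^2/(2 - 4\alpha_j)}}{(1 - 2\alpha_j)^{1/2}}; \quad \forall k \in \{1, 2, ..., n\} \tag{3.25}$$

Moreover, for the second order,

$$\Big\langle e^{\sum_{j=1}^{n} (\alpha_j X_j^2 + \beta_j X_j)}\, X_k^2 \Big\rangle = \frac{\beta_k^2 - 2\alpha_k + 1}{(1 - 2\alpha_k)^2} \prod_{j=1}^{n} \frac{e^{\beta_j^2/(2 - 4\alpha_j)}}{(1 - 2\alpha_j)^{1/2}}; \quad \forall k \in \{1, 2, ..., n\} \tag{3.26}$$

Equation (3.23) can then be evaluated from equations (3.24)-(3.26) for polynomial chaoses of different orders over the design system domain.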
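The closed forms (3.24)-(3.26) are valid when every $\alpha_j < 1/2$, so that the Gaussian integrals converge. As a sanity check, the sketch below compares them with a direct numerical integration, using the fact that the expectation factorizes over the independent variables. The $\alpha$ and $\beta$ values are hypothetical stand-ins for $b_2^{*}\lambda^A_j$ and $B^{*} C_j$, and the function names are illustrative only.

```python
import math

def factor(alpha, beta):
    """Single-variable factor of (3.24): E[exp(alpha*X^2 + beta*X)], alpha < 1/2."""
    return math.exp(beta * beta / (2.0 - 4.0 * alpha)) / math.sqrt(1.0 - 2.0 * alpha)

def closed_form(alpha, beta, k, order):
    """Closed forms (3.24), (3.25), (3.26) for order 0, 1, 2 in X_k."""
    base = math.prod(factor(a, b) for a, b in zip(alpha, beta))
    d = 1.0 - 2.0 * alpha[k]
    if order == 0:
        return base
    if order == 1:
        return beta[k] / d * base
    if order == 2:
        return (beta[k] ** 2 - 2.0 * alpha[k] + 1.0) / d ** 2 * base
    raise ValueError("only orders 0, 1, 2 are covered")

def normal_expectation(g, half_width=30.0, steps=30000):
    """Trapezoidal estimate of E[g(X)] for a standard normal X."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * g(x) * math.exp(-0.5 * x * x)
    return total * h / math.sqrt(2.0 * math.pi)

def numeric(alpha, beta, k, order):
    """Same expectation as closed_form, integrated one variable at a time."""
    result = 1.0
    for j, (a, b) in enumerate(zip(alpha, beta)):
        p = order if j == k else 0
        result *= normal_expectation(
            lambda x, a=a, b=b, p=p: math.exp(a * x * x + b * x) * x ** p)
    return result

alpha = [0.1, -0.2, 0.05]  # hypothetical b2* times eigenvalues of A
beta = [0.4, -0.3, 0.7]    # hypothetical B* times C_j
```

Calling `closed_form(alpha, beta, 0, 1)` and `numeric(alpha, beta, 0, 1)` should agree to many digits; the closed form is what makes the projection cheap enough to repeat for every gate position.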

Note that the eigen-decomposition transformation can be pre-calculated to handle different logic gate placements, because the eigenfunctions $f''(x, y)$ depend on the covariance kernel of the physical parameter random processes rather than on the design placement.

Algorithm Sub Power Projection Procedure

Input: The constants $b_1^{*}$, $b_2^{*}$, $\{f_i^{\prime\prime *}\}$, $\tilde{L}^{*}$, and polynomial $\Phi_k(\vec{\xi})$

Output: Return the value of $\langle e^{B^{*} \sum_{i=1}^{N''_{kl}} f_i^{\prime\prime *} \zeta_i + b_2^{*} q^{*}(\vec{\zeta})}\, \Phi_k(\vec{\xi}) \rangle$, where $B^{*} = b_1^{*} + 2 b_2^{*} \tilde{L}^{*}$ and $q^{*}(\vec{\zeta}) = \{\sum_{i=1}^{N''_{kl}} f_i^{\prime\prime *} \zeta_i\}^2$

1 Begin
2 if the polynomial $\Phi_k(\vec{\xi})$ contains $\varsigma_i$
3   Return 0
4 endif
5 do Transform the quadratic form $q^{*}(\vec{\zeta}) = \vec{\zeta}^T A \vec{\zeta}$ to standard form $\vec{\nu}^T D \vec{\nu}$,
6   $\vec{\nu} = [\nu_1, ..., \nu_{N''_{kl}}]^T$, $Q^T A Q = D$, linear transform $\vec{\zeta} = Q\vec{\nu}$,
7   obtain the eigenvalues $\{\lambda^A_i\}$ of $A$,
8   $D_s^{*} = \prod_{i=1}^{N''_{kl}} \dfrac{e^{(B^{*} C_i)^2 / (2 - 4 b_2^{*} \lambda^A_i)}}{(1 - 2 b_2^{*} \lambda^A_i)^{0.5}}$, $C_i = \sum_{j=1}^{N''_{kl}} Q_{ji} f_j^{\prime\prime *}$ (that is, $C = Q^T f^{\prime\prime *}$).
9 if the order of $\Phi'_k(\vec{\nu})$ = 1
10   Return $\dfrac{B^{*} C_k}{1 - 2 b_2^{*} \lambda^A_k}\, D_s^{*}$
11 elseif the order of $\Phi'_k(\vec{\nu})$ = 2
12   Return $\dfrac{(B^{*} C_k)^2 - 2 b_2^{*} \lambda^A_k + 1}{(1 - 2 b_2^{*} \lambda^A_k)^2}\, D_s^{*}$
13 endif
14 End

Fig. 3.4: Procedure of sub power projection

Equations (3.10), (3.11), (3.12), and (3.14) form a set of deterministic heat transfer equations, whose solutions can be obtained with several existing techniques. In this work, we apply the analytic technique of Generalized Integral Transforms (GIT) [14] as the deterministic solver. Once the set of coefficients $\{\hat{T}_k(r, t)\}$ is obtained, the mean value and variance of the temperature distribution can be computed as

$$E\{T(r, t, \vec{\xi})\} = \hat{T}_0(r, t) \tag{3.27}$$

$$Var\{T(r, t, \vec{\xi})\} = \sum_{i=1}^{N_{pc}} \hat{T}_i^2(r, t)\,\langle \Phi_i^2(\vec{\xi}) \rangle \tag{3.28}$$
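Equations (3.27) and (3.28) amount to reading off the zeroth coefficient and summing the squared remaining coefficients weighted by the basis norms. A minimal sketch at a single point, assuming one random variable $\xi$ with Hermite bases $1$, $\xi$, and $\xi^2 - 1$ (whose norms $\langle \Phi_i^2 \rangle$ are 1, 1, and 2); the coefficient values are hypothetical:

```python
import math

def pc_mean(coeffs):
    """(3.27): the mean is the coefficient of the constant basis."""
    return coeffs[0]

def pc_variance(coeffs, norms):
    """(3.28): sum of squared coefficients times <Phi_i^2>, skipping the constant term."""
    return sum(c * c * n for c, n in zip(coeffs[1:], norms[1:]))

# Hypothetical temperature coefficients (in degrees C) at one grid point.
coeffs = [85.0, 3.0, 0.5]
norms = [1.0, 1.0, 2.0]  # <1^2>, <xi^2>, <(xi^2 - 1)^2> for a standard normal xi

mean_temp = pc_mean(coeffs)
var_temp = pc_variance(coeffs, norms)
std_temp = math.sqrt(var_temp)
```

In the simulator the same two formulas would be applied at every spatial point, with the coefficients coming from the deterministic solver.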


Chapter 4 Experimental Results

The proposed stochastic thermal simulator is implemented in C++ on an HP xw9300 workstation with 16GB of memory. The simulation results are obtained on our test placement benchmark. We create a gate-level leakage power cell library based on HSPICE simulation with the TSMC 65nm technology model. The nominal value of oxide thickness is set to 1.4nm, and the 3σ values of the parameter variations for channel length L and oxide thickness Tox are set to 20% of the nominal parameter values. The ratios of correlation length to chip size in the x direction (ηx/Lx) and the y direction (ηy/Ly) are set to 0.31, which means that the correlation between two devices located half a chip dimension apart in either direction is 0.2. The test placement benchmark, with about four million gates, is generated from the floorplan shown in Fig. 4.1 and is based on the cell library at 65nm technology. The experimental results are summarized as follows:
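The quoted correlation of 0.2 at half a chip dimension is what an exponential kernel $e^{-d/\eta}$ gives for $\eta/L = 0.31$. The kernel form is an assumption here, since it is not restated in this chapter; the one-line check below is only meant to show the arithmetic.

```python
import math

def exponential_correlation(distance_ratio, eta_ratio):
    """Correlation exp(-d/eta), with d and eta both normalized by the chip dimension."""
    return math.exp(-distance_ratio / eta_ratio)

# Half a chip dimension apart, correlation length 0.31 of the chip dimension.
rho_half_chip = exponential_correlation(0.5, 0.31)
print(round(rho_half_chip, 3))
```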


• Accuracy and Efficiency

The accuracy and efficiency of the proposed stochastic thermal simulator are shown in Table 4.1, in comparison with the Monte Carlo method with 100000 samples, which we found to be a reasonable sample count for the Monte Carlo simulation. With Nkl equal to 75 and first-order polynomial chaos, the proposed method has maximum errors of 1.72% in the mean and 1.09% in the standard deviation, and takes only about four minutes (247.4 s). Note that N'kl and N''kl are both set equal to Nkl. The error of the mean value depends strongly on Nkl rather than on the order of PC. The error of the standard deviation depends not only on Nkl but also on the order of PC.

Table 4.1: Accuracy and Efficiency Compared to Monte Carlo Method

Nkl   Npc   PC Order   E{T} MaxErr(%)   σ{T} MaxErr(%)   Run Time(s)   Speedup(X)
25    51    1          4.74             9.4              83.83         1960
50    101   1          2.52             3.3              165.79        990
75    151   1          1.72             1.09             247.4         662
20    461   2          4.87             7.68             949.25        264
25    701   2          4.2              6.43             1339.58       143
30    991   2          3.67             1.41             2039.07       100

• Compared to the Deterministic Thermal Simulation

As shown in Fig. 4.3 and Fig. 4.4, there is an 18% difference in the average temperature between the deterministic nominal power case and the stochastic power case in our test circuit. The deterministic simulation underestimates the temperature profile and the hottest temperature, and therefore cannot offer designers a robust solution. The cost function of traditional thermal-aware floorplanning, placement, and optimization methodologies is usually to avoid hot spots and to smooth the thermal profile. However, deterministic cost and objective functions are not reliable enough in deep sub-micron technology. The stochastic thermal simulator guides designers not only to minimize the mean temperature to avoid high-temperature failure but also to reduce the variance of temperature to lower thermal gradients. In addition, the solutions of the stochastic thermal simulator help designers develop circuits that are more tolerant of manufacturing process variations.

Fig. 4.2: (a) The nominal power distribution at the top surface of die, (b) The mean power distribution at the top surface of die.

Fig. 4.3: (a) The 3D nominal temperature distribution at the top surface of die, (b) The 3D mean temperature distribution at the top surface of die.

Fig. 4.4: (a) The 2D nominal temperature distribution at the top surface of die, (b) The 2D mean temperature distribution at the top surface of die.

• Considering Spatial Correlation Compared to Neglecting Spatial Correlation

We demonstrate the difference between considering and ignoring spatial correlation based on Monte Carlo simulation of our test circuit. From the simulation results, we observe that the mean temperature obtained with spatial correlation is close to that obtained without it. However, the variance of the temperature profile differs entirely between the two methods. As shown in Fig. 4.5, the variance of the temperature distribution is larger when spatial correlation is considered. Moreover, the variance in the area surrounding a region of large variance is also larger. This shows that the variance of the temperature profile spreads to surrounding regions and exhibits spatial correlation. The reason is that the leakage powers of neighboring regions are less correlated when spatial correlation is ignored, whereas with spatial correlation the leakage powers of neighboring regions reinforce each other. From the simulation results of the test circuit in Fig. 4.5(a) and Fig. 4.5(b), the variance obtained with spatial correlation is 3X to 4X the variance obtained without it. A thermal simulator that ignores spatial correlation will therefore underestimate the thermal gradients. Large temperature gradients may not only reduce the lifetime of the chip but also cause thermal stress that cracks circuits [20].

Fig. 4.5: (a) The 3D standard deviation temperature distribution at the top surface of die with considering spatial correlation, (b) The 3D standard deviation temperature distribution at the top surface of die without considering spatial correlation.

Fig. 4.6: (a) The 2D standard deviation temperature distribution at the top surface of die with considering spatial correlation, (b) The 2D standard deviation temperature distribution at the top surface of die without considering spatial correlation.

• Temperature Yield Estimation

The yield loss will worsen in future technologies due to the continued significance of leakage power and increasing process variations. Another troubling observation is that increased variations in leakage power cause not only a larger spread of the temperature distribution but also a higher average temperature. It is worth noting that most current thermal simulation approaches do not consider process variations and are unaware of their impact on yield. These deterministic thermal simulation approaches result in yield loss due to increased susceptibility to process variations. The thermal yield can be evaluated with the following Generalized Chebyshev Inequality [35].

COROLLARY 4.1 If $E[X] = \mu$ and $Var(X) = \sigma^2$, then for $a > 0$

$$P\{X \ge \mu + a\} \le \frac{\sigma^2}{\sigma^2 + a^2} \tag{4.1}$$

$$P\{X \le \mu - a\} \le \frac{\sigma^2}{\sigma^2 + a^2} \tag{4.2}$$

Using the Generalized Chebyshev Inequality, we can obtain tighter bounds on the temperature distribution. The probability that the temperature satisfies a given temperature constraint can be estimated with this inequality. This gives designers a guideline for dealing with yield issues caused by manufacturing process variations. Furthermore, the stochastic thermal solver can be incorporated into other CAD design methodologies such as timing and reliability analysis. Fig. 4.7 shows that the probability of the temperature lying below the upper bound of the mean value plus three standard deviations is larger than 90%.
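The one-sided bound (4.1) turns a mean and a standard deviation into a lower bound on the thermal yield under a temperature limit. The sketch below is illustrative only; the function name and the numeric values are hypothetical.

```python
def thermal_yield_lower_bound(mean, std, t_max):
    """Lower bound on P{T < t_max} obtained from inequality (4.1)."""
    a = t_max - mean
    if a <= 0.0:
        return 0.0  # the inequality gives no information at or below the mean
    var = std * std
    return 1.0 - var / (var + a * a)

# Hypothetical point: mean 100 C, standard deviation 8 C, limit at mean + 3 sigma.
mean_t, std_t = 100.0, 8.0
bound = thermal_yield_lower_bound(mean_t, std_t, mean_t + 3.0 * std_t)
print(bound)
```

With the limit placed three standard deviations above the mean, the bound evaluates to $1 - 1/10 = 0.9$ regardless of the distribution, which is the 90% figure quoted for the mean plus three standard deviations.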


• Hot Spot Observations

Generally speaking, the purpose of thermal-aware placement and floorplanning is to minimize the maximal temperature gradient over the chip and to avoid hot spots. Deterministic thermal simulators report an explicit hot spot location to the CAD designer. However, the variations in leakage power make the on-chip temperature distribution uncertain, so it is difficult to indicate the hot spot location under process variations. In our test circuit, we observe that the hot spot location cannot be determined by a traditional deterministic thermal solver alone. As shown in Fig. 4.4(a), the deterministic thermal simulator places the hot spot at the point (x = 0 (mm), y = 5 (mm)). However, Fig. 4.8 shows that every point on the line y = 5 (mm) has some probability of becoming the hot spot location. Therefore, traditional deterministic thermal simulators are unaware of the precise hot spot location.

It is crucial for designers to indicate the hot spot location precisely under manufacturing process variations. Our future work is to develop a robust scheme for determining the hot spot distribution.

[Plot: The temperature profile of y = 5 (mm). Horizontal axis: x (mm), 0 to 5. Vertical axis: Temperature (℃), 70 to 150. Curves: Mean value, Mean value − σ, Mean value + σ.]


Chapter 5 Conclusion

In this work, we have proposed a stochastic thermal simulation procedure to estimate the statistical temperature distribution in the presence of within-die process variations. Our experimental results show that the proposed method has small errors and high efficiency compared to Monte Carlo simulation. The simulation results indicate that simulation without considering spatial correlation may underestimate critical thermal gradients. We further point out that a traditional deterministic thermal simulator loses information about the thermal profile under manufacturing process variations and leads designers toward overly optimistic decisions. The proposed stochastic thermal simulator offers a robust estimate of the temperature distribution and serves as a necessary tool for thermal-aware methodologies that consider within-die process variations.


Bibliography

[1] Brian E. Stine, Duane S. Boning, and James E. Chung, “Analysis and Decomposition of Spatial Variation in Integrated Circuit Processes and Devices,” IEEE Trans. on Semiconductor Manufacturing, 10(1), Feb. 1997, pp. 24-41.

[2] D. Okumu Ouma, Duane S. Boning, James E. Chung, William G. Easter, Vivek Saxena, Sudhanshu Misra, and Annette Crevasse, “Characterization and Modeling of Oxide Chemical Mechanical Polishing Using Planarization Length and Pattern Density Concepts,” IEEE Trans. on Semiconductor Manufacturing, 15(2), 2002.

[3] M. Orshansky, L. Milor, P. Chen, K. Keutzer, and C. Hu, “Impact of spatial chip gate length variability on the performance of high-speed digital circuits,” IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, 2002, pp. 544-553.

[4] J. Xiong, V. Zolotov, and L. He, “Robust Extraction of Spatial Correlation,” Proceedings of the ACM ISPD, 2006.

[5] F. Liu, “A General Framework for Spatial Correlation Modeling in VLSI Design,” Proceedings of the ACM DAC, 2007.

[6] S. Bhardwaj, S. Vrudhula, P. Ghanta, and Y. Cao, “Modeling of Intra-die Process Variations for Accurate Analysis and Optimization of Nanoscale Circuits,” Proceedings of the ACM DAC, 2006.

[7] P. Friedberg, Y. Cao, J. Cain, R. Wang, J. Rabaey, and C. Spanos, “Modeling Within-Die Spatial Correlation Effects for Process-Design Co-Optimization,” Proceedings of the ISQED, 2005.

[9] R. G. Ghanem and P. D. Spanos, “Stochastic Finite Elements: A Spectral Approach,” Springer-Verlag, 1991.

[10] R. Rao, A. Srivastava, David Blaauw, and D. Sylvester, “Statistical Analysis of Subthreshold Leakage Current for VLSI Circuits,” IEEE Trans. on Very Large Scale Integration Systems, vol. 12, no. 2, February 2004.

[11] W. Huang, M. R. Stan, K. Skadron, K. Sankaranarayanan, S. Ghosh, and S. Velusamy, “Compact Thermal Modeling for Temperature-Aware Design,” Proceedings of the ACM DAC, 2004.

[12] W. Huang, S. Ghosh, S. Velusamy, K. Sankaranarayanan, K. Skadron, and M. R. Stan, “HotSpot: A Compact Thermal Modeling Methodology for Early-Stage VLSI Design,” IEEE Trans. on Very Large Scale Integration Systems, vol. 14, May 2006.

[13] http://www-device.eecs.berkeley.edu/~ptm/

[14] P. Y. Huang, C. K. Lin, Y. M. Lee, “Full-Chip Thermal Analysis for the Early Design Stage via Generalized Integral Transforms,” Proceedings of the ACM ASPDAC, January 2008.

[15] R. H. Cameron, W. T. Martin, “The orthogonal development of nonlinear functionals in series of Fourier-Hermite functionals,” Ann. of Math., 1947.

[16] T. Y. Wang and C. C. P. Chen, “Thermal-ADI: A Linear-Time Chip-Level Thermal Simulation Algorithm Based on Alternating-Direction Implicit (ADI) Method,” IEEE Trans. on Very Large Scale Integration Systems, vol. 11, no. 4, pp. 691-700, August 2003.

[17] T. Y. Wang and C. C. P. Chen, “SPICE-Compatible Thermal Simulation with Lumped Circuit Modeling for Thermal Reliability Analysis Based on Model Reduction,” Proceedings of the ACM ISQED, 2004.

[18] P. Li, L. T. Pileggi, M. Asheghi, and R. Chandra, “IC Thermal Simulation and Modeling via Efficient Multigrid-Based Approaches,” IEEE Trans. on Computer-Aided Design of
