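Development of Electronic Design Automation Technologies for 3D Integration, Subproject 1: Stochastic Electro-Thermal Simulation for 3-D ICs and Its Application to 3-D IC Power Optimization (II)

National Science Council, Executive Yuan: Research Project Report

Final Research Report (Full Version)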

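Project type: Integrated
Project number: NSC 99-2220-E-009-035-
Project period: August 1, 2010 to July 31, 2011
Institution: Department (Institute) of Communication Engineering, National Chiao Tung University
Principal investigator: Yu-Min Lee (李育民)
Project participants:
Master's student, part-time assistant: 吳宗恆
Master's student, part-time assistant: Ting-Jung Li (李亭蓉)
Master's student, part-time assistant: Chih-Sheng Wang (王志升)
Ph.D. student, part-time assistant: Pei-Yu Huang (黃培育)
Ph.D. student, part-time assistant: Shu-Han Whi (魏書含)
Ph.D. student, part-time assistant: Chi-Wen Pan (潘麒文)

Report attachments: report on attending international conferences, and published papers

Handling: this project involves patents or other intellectual property rights; open to public inquiry after 2 years

October 30, 2011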

Development of Electronic Design Automation Technologies for 3D Integration

Subproject 1: Stochastic Electro-Thermal Simulation for 3-D ICs and Its Application to 3-D IC Power Optimization (2/2)

Project number: NSC 99-2220-E-009-035-
Project period: August 1, 2010 to July 31, 2011
Principal investigator: Yu-Min Lee (李育民)

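I. Chinese Abstract

Existing power optimization techniques for 3D ICs seldom discuss multiple supply voltage techniques. This project uses multiple supply voltage techniques to reduce the power consumption of 3D ICs. The developed method has three main parts: (1) supply voltage assignment for 3D ICs, which considers three factors: sensitivity, proximity effect, and the level shifter budget; (2) electro-thermal analysis of 3D ICs, which obtains the temperature distribution of the 3D IC; (3) thermal-aware static timing analysis, which analyzes the delay of the 3D IC. Experimental results verify the effectiveness of this multiple supply voltage technique.

Keywords: 3D IC; electro-thermal simulation; power optimization; low-power design; multiple supply voltage design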

II. English Abstract

Few existing works on power reduction in 3D ICs discuss the use of supply voltage scaling techniques for power optimization. In this work, a supply voltage assignment based power reduction method for minimizing the power consumption of 3D ICs is presented. The proposed approach has three major parts: (1) 3D IC Voltage Assignment for power reduction, which considers three factors: sensitivity, proximity effect, and level shifter (LS) budget; (2) 3D Electro-Thermal Analysis for obtaining the temperature distribution of the 3D IC; (3) Thermal Aware Static Timing Analysis for obtaining the temperature-dependent delay values of functional gates. The experimental results demonstrate the effectiveness of the developed voltage assignment method in power reduction.

Keywords: 3-D IC, Electro-Thermal Analysis, Power Optimization, Low Power Design, Multiple Supply Voltage Design

III. Background and Objectives

In recent years, many researchers have shown that 3D ICs can greatly enhance system integration. However, heat removal is a major challenge in 3D IC design because of the high power density and the low thermal conductivity of the inter-layer dielectrics. Moreover, high temperature seriously affects the timing, power, and reliability of a circuit [1]. Therefore, reducing the power consumption of circuits is necessary to mitigate their thermal problems.

Among existing power reduction techniques, multiple supply voltage (MSV) [2-5] is an effective way to reduce both dynamic power and leakage power. Intuitively, any MSV technique for 2D ICs can be extended to 3D ICs by performing voltage scaling tier by tier. However, assigning voltages without considering every tier simultaneously might leave the power consumption unbalanced across layers and cause thermal problems.

In this work, building on [6], a grid-based post-placement MSV method is developed to reduce 3D IC power consumption and is extended to explore more possibilities. The developed technique simultaneously considers the level shifter (LS) budget and the thermal effect. Compared with previous works that ignore the LS issue and use temperature-independent models, the proposed approach is more flexible and practical.

The report is organized as follows. First, the proposed power optimization methodology is presented in Section IV. After that, experimental results are given in Section V. Finally, conclusions are drawn in Section VI.

IV. Research Method

Fig. 1 shows the proposed post-placement MSV-based power reduction method within the 3D IC design flow. The inputs are a 3D IC placement, a netlist, a cell library, and a timing/leakage power cell library. For the grid-based procedure, each tier is partitioned into n grids, as illustrated in Fig. 2(a). First, the supply voltage of every gate is initialized to the high supply voltage V_DDH, and the initial chip temperature is obtained by the 3D Electro-Thermal Analysis (Section IV.A). Then, the Thermal Aware Static Timing Analysis (Section IV.B) is performed with the temperature-dependent gate delays derived from that initial temperature. After that, the Initial Voltage Assignment (Section IV.C) is executed; because it aggressively assigns the low supply voltage V_DDL to all gates, the circuit timing might be violated. Next, a grid-based procedure, the 3D IC Voltage Assignment (Section IV.D), assigns an appropriate V_DD to trade off the power consumption penalty against the timing saving advantage. Each pass of the voltage assignment procedure changes the power consumption and the delay of the gates, so the thermal and timing analyses are rerun to update the temperature-dependent gate delays and leakage power. The 3D IC Voltage Assignment is repeated until no grid can be selected or the timing violation is rescued (Section IV.E). The resulting voltage assignment can be applied to any voltage island generator. Moreover, because it considers the proximity effect, the 3D IC Voltage Assignment method could be more beneficial for voltage island generation.

Fig. 2. A three-tier design example of the grid-based procedure for generating the voltage assignment

A. 3D Electro-Thermal Analysis

Generally, the dynamic power is independent of temperature, whereas the leakage power is significantly affected by temperature. Based on the empirical model [7], the gate leakage current I_gate is related to the oxide thickness, and the subthreshold current I_sub is related to the channel length and temperature. With the oxide thickness and the channel length fixed, the subthreshold current model of a gate is built by applying the least square fitting method to HSPICE estimates under the 90nm technology as follows.

(1)

(2)

Here, s_0 and s_1 are fitting constants, and T is the operating temperature of the gate/cell.
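As a minimal sketch of such a least square fitting step, assume (hypothetically) a log-linear model ln I_sub = s_0 + s_1 T. The model form, the function name fit_log_linear, and the sample data are illustrative assumptions only; they are not taken from the report, whose actual expressions are (1) and (2).

```python
import math

def fit_log_linear(temps, currents):
    """Least-squares fit of ln(I) = s0 + s1 * T; returns (s0, s1).

    The log-linear form is an assumption made for this sketch.
    """
    ys = [math.log(i) for i in currents]
    n = len(temps)
    mean_t = sum(temps) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in temps)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(temps, ys))
    s1 = sxy / sxx
    s0 = mean_y - s1 * mean_t
    return s0, s1

# Synthetic samples standing in for circuit-simulator output (made-up numbers).
temps = [300.0, 325.0, 350.0, 375.0, 400.0]
currents = [math.exp(-20.0 + 0.02 * t) for t in temps]
s0, s1 = fit_log_linear(temps, currents)
```

Because the synthetic data follow the assumed model exactly, the fit recovers the constants used to generate them; with real simulation data the residual would indicate how well the chosen form matches.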

Since the fitting constants depend on the supply voltage, a look-up table stores them for the supply voltages V_DDL and V_DDH. With (1) and (2), the gate tunneling leakage power and the subthreshold leakage power are

(3)

(4)

The 3D Electro-Thermal Analysis is performed with an electro-thermal iterative updating loop that integrates (3)-(4), the 3D IC statistical thermal simulator [6], and the 3D IC analytical thermal simulator [8].
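The idea of an electro-thermal iterative updating loop can be illustrated with a deliberately simplified sketch. It assumes a single lumped thermal resistance r_th in place of the thermal simulators [6] and [8], and a hypothetical exponential leakage model; all names and numbers below are assumptions for illustration, not the report's models.

```python
import math

def electro_thermal_fixed_point(p_dynamic, vdd, s0, s1, r_th, t_ambient,
                                tol=1e-6, max_iter=100):
    """Iterate temperature -> leakage power -> temperature until it settles."""
    temp = t_ambient
    for _ in range(max_iter):
        i_sub = math.exp(s0 + s1 * temp)   # assumed leakage-current model
        p_leak = vdd * i_sub               # power = voltage * current
        # Lumped thermal model (assumption): T = T_amb + R_th * total power.
        new_temp = t_ambient + r_th * (p_dynamic + p_leak)
        if abs(new_temp - temp) < tol:
            return new_temp, p_leak
        temp = new_temp
    raise RuntimeError("electro-thermal loop did not converge")

# Made-up parameters: 1 W dynamic power, 1 V supply, 20 K/W, 300 K ambient.
temp, p_leak = electro_thermal_fixed_point(1.0, 1.0, -10.0, 0.02, 20.0, 300.0)
```

The loop converges here because the leakage grows slowly relative to the thermal resistance; a steeper leakage model or a larger r_th can make the iteration diverge, which corresponds to thermal runaway.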

B. Thermal Aware Static Timing Analysis

The thermal-aware static timing analysis (STA) is a conventional block-based STA. Each gate delay is temperature dependent and is built in a canonical first-order form by applying the least square fitting method to the Monte Carlo results of HSPICE under the 90nm technology. The general delay expression of a specific gate is

(5)

Here, a_0 through a_4 are V_DD-dependent fitting constants, and a look-up table is constructed to store these coefficients for V_DDH and V_DDL. C_L is the load capacitance.

C. Initial Voltage Assignment

Given a circuit that satisfies timing with the initial supply voltage V_DDH, the Initial Voltage Assignment procedure is performed to save the most power without considering the timing constraints. All gates are assigned to operate at V_DDL, and this step might cause timing violations. After the [...] each gate. With the updated operating temperature of each gate from the 3D Electro-Thermal Analysis and (5), the updated delay of each gate is obtained. Then, the updated arrival time (AT), the updated required arrival time (RT), and the slack of each gate are calculated. Finally, a gate is referred to as a site if its slack is negative, and the set of sites is output to the next step.
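The AT, RT, and slack computation that identifies sites can be sketched as a small block-based STA over a gate-level DAG. This is an illustrative Python sketch, not the authors' C++ code; the function name find_sites, the data layout, and the convention that AT and RT are measured at each gate's output are assumptions, and the delays are taken as already evaluated at each gate's temperature.

```python
from collections import defaultdict

def find_sites(gates, edges, delay, required_at_outputs):
    """Return (slack, sites), where sites are the gates with negative slack.

    gates: gate names in topological order.
    edges: (driver, load) pairs.
    delay: gate -> delay at its current temperature and supply voltage.
    required_at_outputs: gate without fan-out -> required time.
    """
    fanin, fanout = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fanout[u].append(v)
        fanin[v].append(u)

    at = {}
    for g in gates:  # forward pass: latest arrival at each gate output
        at[g] = max((at[u] for u in fanin[g]), default=0.0) + delay[g]

    rt = {}
    for g in reversed(gates):  # backward pass: earliest required time
        if fanout[g]:
            rt[g] = min(rt[v] - delay[v] for v in fanout[g])
        else:
            rt[g] = required_at_outputs[g]

    slack = {g: rt[g] - at[g] for g in gates}
    sites = [g for g in gates if slack[g] < 0]
    return slack, sites

# Toy netlist with made-up delays: A and B drive C, and C drives D.
gates = ["A", "B", "C", "D"]
edges = [("A", "C"), ("B", "C"), ("C", "D")]
delay = {"A": 1.0, "B": 2.0, "C": 1.5, "D": 1.0}
slack, sites = find_sites(gates, edges, delay, {"D": 4.0})
```

In this toy case the path through B is critical, so B, C, and D have negative slack and become sites, while A keeps positive slack.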

D. 3D IC Voltage Assignment

Fig. 3 shows the algorithm of the 3D IC Voltage Assignment. Given the set of sites from the Initial Voltage Assignment, the Grid-Based Procedure decides which grid should be picked first. Then, Voltage Re-Assignment selects sites in the chosen grid to operate at V_DDH to rescue timing. This procedure is repeated until no site in the selected grid can be re-assigned to V_DDH or all sites in the grid have been rescued successfully.

Fig. 3. Algorithm of 3D IC voltage assignment

D.1 Grid-Based Procedure

We improve the idea of [6] with a grid-based procedure that decides which grid is picked first; sites in this grid are then assigned to operate at V_DDH to meet the timing constraints under the complex three-dimensional structure. The goal of this decision is to effectively trade off the power consumption penalty against the timing saving advantage.

Firstly, the weight of each grid is determined by considering three factors: 1) the sensitivity factor, 2) the proximity factor, and 3) the LS budget factor. Then, the three-dimensional structure is vertically compressed into a two-dimensional plane, as illustrated in Fig. 2(b), and the weight of each compressed grid is obtained by accumulating the weights of the grids along the z-axis. After that, the compressed grid with the maximum accumulated weight is selected, which completes the grid-based decision step. Finally, as shown in Fig. 2(c), the selected compressed grid is restored to the original three-layer structure. This procedure decides which grid is handled first by the next stage, Voltage Re-Assignment. The weight of a grid i is defined as

(6)

where the first factor combines the sensitivity factor and the LS budget factor of grid i, and the second factor is the proximity factor of grid i. The first factor reflects two main concerns. The first concern (S_site) is how to make an assignment decision that obtains the most timing rescue with the least penalty in power saving. The second concern, λ_i, is how to make an assignment decision that leads to fewer LS overheads (number of level shifters). The proximity factor considers clustering: we prefer that more gates in a single grid form clusters and that all gates in a cluster operate at the same V_DD. The sensitivity factor and the proximity factor of grid i are defined as follows.

(7)

(8)

These definitions use the sum of the site sensitivities S_site in grid i, the number N_i of all gates in grid i, the number of gates operating at V_DDH, the number of gates operating at V_DDL, and the LS budget factor λ_i, which indicates whether the estimate is appropriate.

Once the operating voltages of all sites in the selected grid are determined, the needed number of level shifters is determined, and λ_i can be defined as

(9)

It is defined in terms of two quantities: the number of level shifters needed if all sites in grid i are assigned to operate at V_DDH, and the number of level shifters available in grid i, which is estimated from the attainable white space of grid i.

Finally, the grid sensitivity and the site sensitivity are defined as follows.

(10)

(11)

These definitions use the delay difference and the power dissipation difference between V_DDL and V_DDH, the slack of site j, and the LS power P_LS.
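The selection step of the grid-based procedure (accumulate each grid's weight along the z-axis, then pick the compressed grid with the largest sum) can be sketched as follows. This is a minimal Python illustration rather than the authors' implementation; the function name select_grid, the nested-list layout weights[z][y][x], and the sample numbers are assumptions, and the per-grid weights are taken as given instead of being computed from (6).

```python
def select_grid(weights):
    """weights[z][y][x] is the weight of the grid at tier z, row y, column x.

    Returns ((y, x), total) for the compressed grid whose weights, summed
    over all tiers, are largest.
    """
    tiers = len(weights)
    rows, cols = len(weights[0]), len(weights[0][0])
    best_pos, best_sum = None, float("-inf")
    for y in range(rows):
        for x in range(cols):
            # Vertical compression: accumulate along the z-axis.
            total = sum(weights[z][y][x] for z in range(tiers))
            if total > best_sum:
                best_pos, best_sum = (y, x), total
    return best_pos, best_sum

# Three tiers, each partitioned into 2 x 2 grids (made-up weights).
weights = [
    [[0.2, 0.5], [0.1, 0.0]],
    [[0.3, 0.1], [0.4, 0.2]],
    [[0.1, 0.3], [0.6, 0.1]],
]
pos, total = select_grid(weights)
```

The returned position identifies a column of grids, one per tier; restoring the three-layer structure then amounts to handling the grids at that (y, x) position on every tier.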

D.2 Voltage Re-Assignment

Based on the decision of the grid-based procedure, a Voltage Re-Assignment procedure is performed to obtain the best timing rescue with the least power saving penalty within the selected grid i. Two factors, sensitivity and LS budget, are considered when selecting a site to operate at V_DDH. To start with, the site with the maximum sensitivity is selected. Then, the number of level shifters that would be used if the site were assigned to V_DDH is checked. If this number exceeds the LS budget of the selected grid, the site is not assigned to V_DDH, and the site with the next largest sensitivity is tried until a site meets both constraints. The selected site in grid i is then assigned to operate at V_DDH to rescue the timing. Once a site is assigned to V_DDH, the timing and sensitivity information of the gates is affected and must be updated; the number of sites in grid i can decrease during the assignment procedure. After the update, the site with the maximum sensitivity is selected again. The selection and updating steps are repeated until no site remains in grid i or all sites in grid i have been selected and assigned. Therefore, the re-assignment procedure is executed at most as many times as the initial number of sites in grid i, which reduces the computation load of updating sensitivity.
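The greedy selection in Voltage Re-Assignment can be sketched as below. This is a simplified Python illustration under stated assumptions: sensitivities are kept static (the procedure described above updates timing and sensitivity after every assignment), and the level shifter count is supplied by a hypothetical callback ls_if_assigned. None of these names come from the report.

```python
def reassign_voltage(sensitivity, ls_if_assigned, ls_budget):
    """Greedy sketch: raise sites to V_DDH in order of sensitivity.

    sensitivity: site -> sensitivity value (static in this sketch).
    ls_if_assigned: hypothetical callback (assigned_set, site) -> number of
        level shifters the grid would need if `site` were also raised.
    ls_budget: maximum number of level shifters allowed in the grid.
    """
    assigned = set()
    remaining = set(sensitivity)
    while remaining:
        # Try candidates in descending sensitivity; take the first that fits.
        for site in sorted(remaining, key=sensitivity.get, reverse=True):
            if ls_if_assigned(assigned, site) <= ls_budget:
                break
        else:
            return assigned  # no remaining site fits the LS budget
        assigned.add(site)
        remaining.remove(site)
    return assigned

# Made-up data: an additive per-site LS cost is a simplification, since the
# real count depends on which neighbouring gates share a supply voltage.
costs = {"s1": 2, "s2": 1, "s3": 1}
sens = {"s1": 0.9, "s2": 0.5, "s3": 0.4}

def additive_ls(assigned, site):
    return sum(costs[s] for s in assigned) + costs[site]

chosen = reassign_voltage(sens, additive_ls, ls_budget=3)
```

Each pass either assigns one site or terminates, so the loop runs at most once per initial site, matching the bound stated above.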


E. Timing and LS Budget Rescue

E.1 Timing Rescue

The 3D IC Voltage Assignment result might still violate the circuit timing because the LS budget is limited, so V_DDH cannot be assigned to sites arbitrarily. Therefore, a timing rescue step is executed. For example, in the circuit schematic of Fig. 4(a), the red gates are sites and the blue gates operate at V_DDL. First, we compute the gain of each site and sort the sites by gain. The gain is the delay difference between V_DDH and V_DDL. The site with the maximum gain is selected first and assigned to operate at V_DDH. Then, the fan-in gates of the site are examined. For example, we first select the fan-in gate C, which has the maximum AT, to operate at V_DDH and update its timing information. If the maximum AT moves from gate C to gate D after gate C is assigned to V_DDH, we must then assign V_DDH to gate D as well. In this way, all fan-in gates of the site are checked until the dominant gate is found. Although the timing is rescued, the number of level shifters might exceed the LS budget because the timing rescue procedure does not consider the LS budget. Therefore, we have to sacrifice some power saving by changing more gates from V_DDL to V_DDH to reduce the number of level shifters. The following step rescues the level shifter usage to meet the LS budget.

E.2 LS Budget Rescue

Intuitively, we can start with the fan-in gate of an LS and assign V_DDH to it. This not only reduces the usage of level shifters but also maintains the correctness of the circuit. In Fig. 4(a), the yellow gates are level shifters, the red gates operate at V_DDH, and the blue gates operate at V_DDL. The LS budget rescue step finds all level shifters and checks whether their fan-in gates at V_DDL can be changed to V_DDH to reduce the number of level shifters. For example, Fig. 4(a) shows that gate C is the fan-in of LS No. 1 and No. 2, and gate D is the fan-in of LS No. 3. The gain of gate C is -2+1=-1, and the gain of gate D is -1+2=1. Thus, gate C removes one LS overall, whereas gate D adds one. Therefore, we select gate C to operate at V_DDH to reduce the usage of level shifters, as shown in Fig. 4(b). Similarly, the gains of gates A and D in Fig. 4(b) are checked again. The rescue procedure stops when no gate can be assigned to operate at V_DDH to reduce the usage of level shifters.
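The gain arithmetic in this example can be written out as a short sketch. The function names ls_gain and pick_gate and the dictionary layout are assumptions for illustration; the counts for gates C and D are the ones quoted in the example above.

```python
def ls_gain(ls_removed, ls_added):
    """Net change in level-shifter count if the gate is raised to V_DDH."""
    return -ls_removed + ls_added

def pick_gate(candidates):
    """candidates: gate -> (ls_removed, ls_added).

    Returns the gate with the most negative gain, or None when no gate
    reduces the number of level shifters.
    """
    gains = {g: ls_gain(r, a) for g, (r, a) in candidates.items()}
    if not gains:
        return None
    best = min(gains, key=gains.get)
    return best if gains[best] < 0 else None

# Gate C feeds two level shifters and would need one new one at its inputs;
# gate D feeds one level shifter and would need two new ones.
candidates = {"C": (2, 1), "D": (1, 2)}
selected = pick_gate(candidates)
```

Repeating pick_gate after each assignment, with the counts recomputed for the new voltage assignment, gives the stopping rule described above: the loop ends when it returns None.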

V. Experimental Method and Results

We implement the proposed method in C++ and apply it to a set of ISCAS89 benchmark circuits and private designs. First, Design Compiler is used to synthesize the benchmark circuits with the UMC 90nm standard cell library. Next, the initial 2D placement of each test circuit is generated by SOC Encounter, and the corresponding 3D placement is obtained by transforming the 2D placement with Z-Place [9]. The timing/leakage power cell library with temperature effect is generated by evaluating the average leakage current and the gate delay with HSPICE simulation for various types of logic gates. From the HSPICE simulation results, the fitting constants of the leakage current and delay models are obtained by the least square fitting method. For simplicity, a single type of LS is used in the experiments.

A. Comparison of Voltage Assignment Results

TABLE I lists the results of voltage assignment in different phases. The number of sites after Initial Voltage Assignment is listed in column 4. Columns 5-6 show the number of sites and the usage of level shifters after 3D IC Voltage Assignment. The results of Timing Rescue and LS Budget Rescue are listed in columns 7-8 and column 9, respectively. Finally, the summary is listed in columns 10-11.

First, the results of 3D IC Voltage Assignment show that the timing rescue after Initial Voltage Assignment is significantly limited by the LS budget. Most circuits cannot meet the timing constraints after the 3D IC Voltage Assignment procedure under the limited LS budget. Next, to deal with this problem, we rescue the sites of each circuit by Timing Rescue. After the Timing Rescue procedure, all circuits meet the timing constraints. However, the usage of level shifters exceeds the LS budget because the timing rescue procedure considers only the influence of voltage assignment on timing and does not take the LS budget into account. Finally, we rescue the circuits again by LS Budget Rescue. Column 11 shows that three circuits still cannot be rescued.

To reduce the problem size, the 3D IC Voltage Assignment method limits the voltage assignment decision to the sites, and the number of sites in a circuit should not be large. Moreover, we regard the sites as the most important gates for rescuing the timing. However, under the LS budget constraint, considering only the sites is not enough for the timing rescue.

TABLE I. Results of the Proposed Voltage Assignment Method

B. Power Reduction

TABLE II summarizes the optimization results. TABLE I shows that three circuits fail and six circuits succeed after the proposed power reduction method is executed. Therefore, the average improvement is the average over the six successful circuits. The initial power and average temperature of each circuit are listed in columns 2-4, and the power and average temperature of the circuits and the power of level shifters after optimization are listed in columns 5-8. The improvement percentages of the dynamic power, the leakage power, and the total power are listed in columns 9-11, respectively. The temperature decrement of each circuit is listed in column 11. The improvement columns indicate that the proposed 3D IC Voltage Assignment method provides almost 33.50% total power saving and a 26.34 degree temperature decrement on average. The leakage power reduction is greatly improved because of the temperature decrement.

TABLE III shows the leakage power and temperature estimation with the simulated temperature in columns 2-3 and without the simulated temperature in columns 4-5, and the percentage differences of leakage power in column 6. As shown in TABLE III, full-chip leakage power analysis without accurate temperature can lead to a 53.88% error on average. If the leakage power is underestimated, the power reduction can be dominated by the dynamic power, which is quite impractical.

TABLE III. Leakage Power Estimation

VI. Conclusions and Discussion

In this work, a 3D IC Voltage Assignment method that combines grid selection by the Grid-Based Procedure with Voltage Re-Assignment is proposed to minimize the total power consumption of a 3D IC design. By employing temperature-dependent gate delay and leakage power models, a more accurate estimate of circuit performance is obtained. Although three circuits could not be rescued, the experimental results show a substantial power reduction by the proposed method.

VII. Publications

[1] Huai-Chung Chang, Pei-Yu Huang, Ting-Jung Li, and Yu-Min Lee, "Statistical Electro-Thermal Analysis with High Compatibility of Leakage Power Models," International SoC Conference (SOCC), 2010.

[2] Shu-Han Whi and Yu-Min Lee, "Dual Supply Voltage Assignment in 3D ICs Considering Thermal Effects," The 16th Workshop on Synthesis And System Integration of Mixed Information Technologies (SASIMI), 2010.

[3] Yu-Min Lee and Chi-Wen Pan, "Redundant Via Insertion with Wire Spreading Capability," International Journal of Electrical Engineering (IJEE), vol. 17, no. 6, pp. 383-398, December 2010.

[4] Chi-Wen Pan, Yu-Min Lee, and Chih-Sheng Wang, "Redundant Via Insertion under Timing Constraints," International Symposium on Quality Electronic Design (ISQED), 2011.

[5] Shu-Han Whi and Yu-Min Lee, "Supply Voltage Assignment for Power Reduction in 3D ICs Considering Thermal Effect and Level Shifter Budget," International Symposium on VLSI Design, Automation and Test (VLSI-DAT), 2011.

[6] Pei-Yu Huang and Yu-Min Lee, "Statistical Hot-Spot Identification Using On-Chip Thermal Yield Profile," VLSI Design/CAD Symposium (VLSI/CAD), 2011.

[7] Pei-Yu Huang and Yu-Min Lee, "On-Chip Statistical Hot-Spot Estimation Using Mixed-Mesh Statistical Polynomial Expression Generating and Skew-Normal Based Moment Matching Techniques," accepted by Asia South Pacific Design Automation Conference (ASPDAC), 2012.

VIII. References

[1] V. Reddy and A. T. Krishnan. Impact of negative bias temperature instability on digital circuit reliability. Proceedings of IRPS, pages 248-254, 2002.

[2] S. H. Kulkarni and A. N. Srivastava. A new algorithm for improved VDD assignment in low power dual VDD systems. Proceedings of ISLPED, pages 200-205, 2004.

[3] H. Wu and I. M. Liu. Post-placement voltage island generation under performance requirement. Proceedings of ICCAD, pages 309-316, 2005.

[4] R. L. S. Ching and E. F. Y. Young. Post-placement voltage island generation. Proceedings of ICCAD, pages 641-646, 2006.

[5] H. Wu and M. D. F. Wong. Timing-constrained and voltage-island-aware voltage assignment. Proceedings of DAC, page 432, 2006.

[6] S. A. Yu, P. Y. Huang, and Y. M. Lee. A multiple supply voltage based power reduction method in 3-D ICs considering process variations and thermal effects. Proceedings of ASPDAC, pages 55-60, 2009.

[7] H. F. Dadgour and S. C. Lin. A statistical framework for estimation of full-chip leakage-power distribution under parameter variations. IEEE Transactions on Electron Devices, 54(11):2930-2945, 2007.

[8] P. Y. Huang and Y. M. Lee. Full-chip thermal analysis for the early design stage via generalized integral transforms. IEEE Transactions on Very Large Scale Integration Systems, 17(4):613-626, 2009.

[9] R. Hentschke and G. Flach. 3D-vias aware quadratic placement for 3D VLSI circuits. Proceedings of ISVLSI, pages 67-72, 2007.

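Form Y04

National Science Council, Executive Yuan: Report on Subsidized Attendance of Domestic Experts and Scholars at an International Academic Conference

October 14, 2010

Name of reporter: Ting-Jung Li (李亭蓉)
Affiliation and position: Department (Institute) of Communication Engineering, National Chiao Tung University; master's student
Conference dates and location: September 27-29, 2010; Nevada, US
NSC grant numbers: NSC 98-2220-E-009-058-; NSC 99-2220-E-009-035-
Conference name: 23rd IEEE International SoC Conference (SOCC 2010) (第 23 屆國際系統晶片會議)
Paper title: Statistical Electro-Thermal Analysis with High Compatibility of Leakage Power Models (對功率模型具有高度相容性的統計型電熱分析)

The report should include the following items:

I. Conference Participation

In addition to oral and poster paper presentations, the organizers invited experts from different fields to give forward-looking talks on various design considerations for systems-on-chip. During the conference, I heard power experts analyze how present and future energy reduction shapes the development of systems-on-chip, microprocessors, and computing systems, and I discussed research topics with several paper authors and with scholars and experts from related industries.

This symposium has long been one of the international conferences that research institutions follow closely, and I was honored to attend the 2010 edition held in Nevada, U.S.A. The conference was divided into oral and poster presentations. The papers presented covered a broad range of topics and advanced techniques, so attending not only deepened my knowledge of my own research area but also helped me understand current trends in systems-on-chip.

The paper we presented at this conference is "Statistical Electro-Thermal Analysis with High Compatibility of Leakage Power Models". The full paper is attached.

II. Reflections on the Conference

Attending this International SoC Conference benefited me greatly. I learned about many design guidelines and trends for systems-on-chip, for example "green" systems aimed at saving energy and at handheld convenience. Through the oral paper sessions, I encountered algorithms from different subfields that solve similar problems, and I received suggestions and directions for improvement from other experts.

The conference mainly covered the following areas: energy optimization techniques and circuits for systems-on-chip, analog circuits, system design methodology, communication circuits and systems, and embedded memory systems.

III. Suggestions

Besides participants from Europe and the United States, those from Korea, Japan, and China all take part actively in international conferences. Encouraging more participation in similar events in the future would greatly help international exchange and cooperation, and contact with overseas scholars would broaden our international perspective and strengthen our research capability.

IV. Materials Brought Back

Conference paper and poster CD: contains all papers and posters presented at the conference.
Conference handbook: abstracts of all talks, papers, and posters, together with the conference program.

Attachment 3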

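NSC-Subsidized Research Project: Report on Attending an International Academic Conference

Date: March 20, 2011

Project number: NSC99-2220-E-009-035
Project title: Development of Electronic Design Automation Technologies for 3D Integration, Subproject 1: Stochastic Electro-Thermal Simulation for 3-D ICs and Its Application to 3-D IC Power Optimization (2/2)
Traveler: Chi-Wen Pan (潘麒文)
Affiliation and position: SOC Group, Institute of Communication Engineering, National Chiao Tung University; third-year Ph.D. student
Conference dates: March 14 to March 16, 2011
Conference location: Santa Clara, USA
Conference name: 11th International Symposium on Quality Electronic Design (電子設計品質會議)
Paper title: Redundant Via Insertion under Timing Constraints (在時序限制下的冗餘接點安插)

I. Conference Participation

ISQED 2011 was held in Santa Clara, USA. Our paper, "Redundant Via Insertion under Timing Constraints", was honored to be among those selected in the ISQED review process. Nearly 14 workshops took place during the three-day conference, and because of the tight schedule many of them ran in parallel. Attendees could therefore freely choose the topics they were interested in, learn about current trends in electronic design automation and related application results, and see the results published by other researchers.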

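[...] some sessions ran concurrently. For sessions such as the demo and poster sessions, however, one could first go to the demo hall to look at each topic and listen to the authors explain and demonstrate their work; some even offered actual devices that could be borrowed and taken out of the hall for testing. After going through the demos, one could go outside to view the posters, and nearly every poster had an author beside it to explain. Although these two sessions ran at the same time, the schedule was arranged well, so it did not feel as if everything was over in a moment. The Best Papers session was held in a very large hall, but it was so well attended that latecomers might not find a seat.

I gave my talk on the last day of the conference. Four sessions ran in parallel that day, so each room was less crowded than before. A small incident occurred during the conference: the data on one speaker's slides turned into garbled characters, and the professor joked that the slides must be suffering from the change of climate. This showed me the importance of adapting on the spot, and I admired how the speaker did not let the slide failure affect the rest of the talk. In the same session I met several other students from National Chiao Tung University, which was also a special experience.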

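II. Reflections on the Conference

Attending ISQED 2011 was my first time at a large international conference and also my first trip to the United States; I was glad and honored to have the opportunity. During the breaks I saw scholars and researchers discussing the talks they had just heard, which I admired, and I hope to become like them.

Because it was my first large conference, I did not know how to handle many things. Fortunately, my travel companion had experience and helped with many of them, and I learned that doing more homework in advance makes things much easier after arriving in the United States. The United States is vast; without transportation, getting around Santa Clara is truly inconvenient. Luckily, there are some rail systems that took us to the busier downtown areas. At the hotel we saw people from many countries. I found that Europeans and Americans tended to be more outgoing and would greet us first, while ethnic Chinese attendees usually kept to their own groups.

When attending other sessions, I paid particular attention to how others prepared their slides and studied how to use simple, clear methods so that the audience can absorb and understand quickly. For this reason, I kept adjusting my own slides until a few days before my talk, using animations to emphasize the key points in the hope that the audience could follow more easily. Unfortunately, some sessions ran in parallel, so I could not hear every talk and had to choose sessions by their topics.

The hotel provided wireless Internet service, but in the room the signal often seemed weak and the connection unstable, so I could not look things up online; this again highlighted the importance of preparing in advance. This trip showed me the scale and quality of a large international conference and the researchers who come from all over the world. I felt the pressure of international competition keenly, and we must concentrate all the more on our academic research.

III. Site Visits (omitted if not applicable)

IV. Suggestions

More large international conferences of this kind could be held domestically, with well-known overseas scholars invited to give talks, to raise the scale and quality of the conferences and connect with the world.

V. Materials Brought Back

11th ISQED CD x1: contains all papers from the conference

VI. Other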

--- Forwarded Message ---
Sent: Sat, 26 Jun 2010 06:14:30 -0700
Subject: Your SOCC 2010 Submission (Number 68)

Dear Prof. Yu-Min Lee:

On behalf of the SOCC 2010 Program Committee, we are pleased to inform you that the following submission has been accepted to appear at the conference as a regular paper:

Statistical Electro-Thermal Analysis with High Compatibility of Leakage Power Models

Please revise your paper according to the reviews. Your final manuscript will appear in the proceedings. The manuscript is limited to SIX pages. The deadline for submission is Friday July 9, 2010.

To upload your final manuscript, please visit the following site:

https://www.softconf.com/b/socc2010/

and, on the left-hand side of the page, enter the passcode associated with your submission. Your passcode is as follows:

68X-F8B3B5H7B5

Alternatively, you can click on the following URL, which will take you directly to a form to submit your final paper:

https://www.softconf.com/b/socc2010/cgi-bin/scmd.cgi?scmd=aLogin&passcode=68X-F8B3B5H7B5

The reviews and comments are attached below. Please try to follow the reviewers' advice when you revise your paper.

Congratulations on your fine work. If you have any additional questions, please feel free to get in touch.

Best Regards,

--- Original Message ---
Subject: Your ISQED 2011 Submission (Number 238)
From:
Date: Thu, November 25, 2010 7:01 am
To:
---

Dear Mr. Chi-Wen Pan:

On behalf of the ISQED 2011 Program Committee, I am delighted to inform you that the following submission has been accepted to appear at the conference:

Redundant Via Insertion under Timing Constraint

The Program Committee worked very hard to thoroughly review all the submitted papers. Please repay their efforts by following their suggestions when you revise your paper.

To upload your final manuscript, please visit the following site:

https://www.softconf.com/b/isqed2011/

and, on the left-hand side of the page, enter the passcode associated with your submission. Your passcode is as follows:

238X-F5G3P6H9C5

Alternatively, you can click on the following URL, which will take you directly to a form to submit your final paper:

https://www.softconf.com/b/isqed2011/cgi-bin/scmd.cgi?scmd=aLogin&passcode=238X-F5G3P6H9C5

The reviews and comments are attached below. Again, try to follow their advice when you revise and improve the quality of your paper.

[...] free to get in touch.

Best Regards,

Kamesh Gadepally - ISQED2011 TPC Chair
Keith Bowman, ISQED2011 TPC Co-Chair
ISQED 2011

STATISTICAL ELECTRO-THERMAL ANALYSIS WITH HIGH COMPATIBILITY OF LEAKAGE POWER MODELS

Huai-Chung Chang, Pei-Yu Huang, Ting-Jung Li and Yu-Min Lee

National Chiao Tung University, Hsinchu, Taiwan

Abstract: In this work, a statistical electro-thermal analyzer with high compatibility of power models is developed. The developed analyzer combines the ease of implementation of the Monte Carlo method with the fast convergence of the stochastic analysis method to effectively solve the statistical electro-thermal problem. Experimental results indicate that the developed electro-thermal analyzer can be orders of magnitude faster than the Monte Carlo method at the same accuracy level. The computational time is only 1.16 seconds for a design with over one million gates, and the maximum errors are only 0.34% and 1.84%, compared with the Monte Carlo method, for estimating the mean and the standard deviation profiles of the full-chip temperature distribution, respectively.

I. INTRODUCTION

Power dissipation and thermal effects are important issues in VLSI design as the technology continuously scales down and the power density rapidly increases. The chip-temperature profiles and gradients significantly influence IC performance, reliability, and package cost. Because leakage power contributes a large portion of total power in modern technology, it is necessary to model and estimate leakage power accurately. Furthermore, the leakage power of a circuit element depends exponentially on its operating temperature and process parameters. Hence, process variations and thermal impacts need to be carefully considered. In recent years, several thermal-power related analysis methods have been proposed. In power analysis, [1]–[3] quantified the process variations of leakage power. Nevertheless, none of them simultaneously considers the statistical power and the electro-thermal effect. In electro-thermal analysis, [4] proposed a deterministic electro-thermal analyzer considering the temperature dependence of leakage power.

To include process variation effects, the electro-thermal simulation needs to be treated in a statistical fashion to ensure design reliability. Hence, several statistical thermal analyzers were developed [5], [6]. However, [5] did not consider thermal coupling. Though [6] presented a statistical electro-thermal analysis, it needs to re-fit the leakage power model whenever the design or its geometry changes, because its model fits the leakage power of each temperature grid rather than that of each gate. This limits its usage in early physical design stages. Moreover, both of them need specific leakage power models for the power projection [5] and the iterative log-normal approximation [6].

Because technology scaling can lead to more complicated leakage power models for enhancing accuracy, it is urgent to develop a statistical thermal analyzer that is highly compatible with accurate but complicated leakage power models. Compared with [5], [6], our developed statistical electro-thermal analyzer is more widely applicable since we take advantage of the sparse grid collocation technique [7] to avoid a convoluted statistical calculation algorithm. The sparse grid collocation technique has been adopted in thermal-power related research such as building leakage power models [3] and analyzing statistical leakage power [2]. However, neither [2] nor [3] considered or indicated how to treat temperature dependence in their power analysis methods. In this work, we present how to easily, accurately, and efficiently solve the statistical electro-thermal problem with any temperature-dependent leakage power model. Moreover, unlike [6], the developed electro-thermal analyzer does not need to re-fit leakage power models during thermal-driven early physical design stages such as floorplanning or placement, because cell-based leakage power models are adopted. Firstly, the Karhunen-Loève (KL) expansion is used to transform spatially correlated physical parameters into a set of uncorrelated random variables. Then, the Smolyak sparse grid formulation [7] is applied to obtain the sampling values of the physical parameters, which give the deterministic power models used in the deterministic electro-thermal simulations. After the set of deterministic electro-thermal simulations is solved, the Newton interpolating formula is utilized to calculate the expression coefficients of the temperature profile. Finally, the statistical characteristics of the temperature distribution can be extracted.

Our major contributions are

1) This work presents an easy, accurate and efficient statistical electro-thermal simulation that is highly compatible with any power model.

2) The developed statistical electro-thermal analyzer can accurately and efficiently provide the mean and standard deviation profiles of the full-chip temperature distribution.

3) Experimental results reveal that ignoring electro-thermal coupling in statistical thermal analysis can lead to significant errors in the full-chip temperature distribution.

This paper is organized as follows. Firstly, the leakage power modeling and the problem formulation are described in Section II. After that, the proposed statistical electro-thermal analyzer is detailed in Section III. Finally, experimental results and the conclusion are given in Sections IV and V, respectively.

II. LEAKAGE POWER MODELING AND PROBLEM FORMULATION

A. Leakage Power Modeling

Many leakage power models were developed in [1], [2], [5], [6], [8]. However, none of the models in [1], [2], [5] simultaneously considered temperature and process variation effects. Hence, their accuracy degrades as the technology scales down. To the best of the authors' knowledge, only [6], [8] considered both effects simultaneously. Nevertheless, the leakage current model in [8] was based on 90nm technology. Hence, as the technology advances, its accuracy deteriorates, as shown in TABLE I. A grid-based leakage power model was developed in [6]. Each fitted model was used to coarsely approximate the total leakage current in each grid, and this limits its usage after the floorplanning stage.


TABLE I
ERROR COMPARISON OF $I_s$ AND $I_g$ WITH THE RESULTS OF HSPICE UNDER 65nm TECHNOLOGY FOR A NAND GATE.

| Exponent | Temperature | Terms of the fitting form | Max Error | Avg. Error | Error > 3% |
|---|---|---|---|---|---|
| $f_g$ | Without temperature | $T_{ox}$, $T_{ox}^2$, $L_{ch}$, $L_{ch}^2$ [1] | 6.48% | 2.70% | 4.37% |
| $f_g$ | With temperature | Our adopted model: a polynomial function constructed by $L_{ch}$, $T$, $T_{ox}$ and $T_{ox}^2$ | 1.55% | 0.29% | 0.00% |
| $f_s$ | Without temperature | $L_{ch}$, $L_{ch}^2$, $T_{ox}^{-1}$, $T_{ox}^2$ [1] | 347.32% | 70.65% | 98.27% |
| $f_s$ | Without temperature | $L_{ch}$, $L_{ch}^2$, $T_{ox}^{-1}$, $T_{ox}$, $T_{ox}^2$, $T_{ox}/L_{ch}$, $L_{ch}/T_{ox}$, $T_{ox}L_{ch}$ [2] | 314.13% | 70.52% | 100.00% |
| $f_s$ | With temperature | $L_{ch}$, $T$, $T_{ox}$ [8] | 32.23% | 8.73% | 76.62% |
| $f_s$ | With temperature | Our adopted model: a 3rd-order polynomial function completely expanded by $L_{ch}$, $T_{ox}$ and $T$ | 1.31% | 0.19% | 0.00% |

Here, a cell-based leakage power fitting model including the process variation effect and temperature dependence is presented. Firstly, for each cell, different input patterns, various physical process parameter values and operating temperatures are combined and fed into HSPICE with an industrial design kit under the BSIM4 model to generate its leakage current data. After that, the leakage currents averaged over the input patterns are fitted by the least-squares method. Finally, the fitted coefficients of the different average leakage current models, such as the average subthreshold leakage ($I_s$) and the average gate tunneling leakage ($I_g$), are obtained.

Since $I_s$ is the off-state leakage, and $I_g$ occurs in both the on and off states of a transistor, the cell leakage power can be written as

$$P_{leakage} = V_{dd} \times \left(I_g + (1 - S_w)\, I_s\right), \tag{1}$$

where

$$I_g = a_0 \cdot e^{f_g(T_{ox}, L_{ch}, T)}, \tag{2}$$

$$I_s = b_0 \cdot e^{f_s(T_{ox}, L_{ch}, T)}. \tag{3}$$

Here, $a_0$ and $b_0$ are fitted constants, and $L_{ch}$ and $T_{ox}$ are the channel length and oxide thickness, respectively. $T$ is the operating temperature, $S_w$ is the switching activity, $V_{dd}$ is the supply voltage, and $f_g(T_{ox}, L_{ch}, T)$ and $f_s(T_{ox}, L_{ch}, T)$ are specific fitting forms¹. $I_g$ is modeled as exponentially dependent on temperature since it is exponentially affected by the threshold voltage [9].
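As a minimal sketch of how equations (1)–(3) can be evaluated for one cell, the following Python fragment assumes a first-order exponent for both $f_g$ and $f_s$. The function names, the exponent form and all numerical constants are hypothetical placeholders chosen for illustration; they are not the fitted models of this work.

```python
import math

def exp_model(scale, coeffs, t_ox, l_ch, temp):
    """scale * exp(f), with an assumed first-order f = c_tox*Tox + c_lch*Lch + c_temp*T."""
    c_tox, c_lch, c_temp = coeffs
    return scale * math.exp(c_tox * t_ox + c_lch * l_ch + c_temp * temp)

def cell_leakage_power(vdd, sw, t_ox, l_ch, temp, gate_fit, sub_fit):
    """Eq. (1): P_leakage = Vdd * (Ig + (1 - Sw) * Is)."""
    i_g = exp_model(*gate_fit, t_ox, l_ch, temp)  # Eq. (2)
    i_s = exp_model(*sub_fit, t_ox, l_ch, temp)   # Eq. (3)
    return vdd * (i_g + (1.0 - sw) * i_s)

# Hypothetical constants (a0, coefficients) and (b0, coefficients); not from the paper.
gate_fit = (1.0e-9, (-2.0, 0.0, 0.001))
sub_fit = (1.0e-12, (0.0, -3.0, 0.03))

# Same cell at two operating temperatures (normalized Tox and Lch, temperature in kelvin).
p_cold = cell_leakage_power(1.0, 0.2, 1.0, 1.0, 300.0, gate_fit, sub_fit)
p_hot = cell_leakage_power(1.0, 0.2, 1.0, 1.0, 380.0, gate_fit, sub_fit)
```

With these placeholder constants the hotter operating point yields the larger leakage power, which is the temperature feedback that the electro-thermal analysis has to resolve.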

The accuracy of several existing leakage current models [1], [2], [8] has been investigated. Because they are not accurate enough, more accurate leakage current models are proposed in this work. The error comparison between the existing and the proposed leakage current models for a two-input NAND gate is shown in TABLE I.

As shown in TABLE I, different $f_g$ and $f_s$ lead to different errors compared with the results of HSPICE. The large errors of [1], [2], [8] arise because they ignore either the temperature or the advance of technology. The maximum error and average error of the proposed models are at most 1.55% and 0.29%, respectively. For all cell types given by an industrial design kit, the maximum error and average error of our leakage current models are only 1.55% and 0.5%, respectively.

As demonstrated above, the leakage current (power) model might have to be very complicated to achieve acceptable accuracy. This indicates that a statistical power analyzer or statistical thermal analyzer should be able to handle complicated leakage current (power) models.
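The fitting and error-reporting procedure can be sketched as follows. This is only an illustration under stated assumptions: the reference currents are synthetic (generated from a made-up exponential law, not from HSPICE), the exponent is first order, and the helper names `fit_exponent`, `predict` and `error_report` are hypothetical. It reports the same three quantities as TABLE I for a model without and with the temperature term.

```python
import numpy as np

def fit_exponent(features, currents):
    """Least-squares fit of ln(I) = ln(a0) + features @ c; returns [ln(a0), c...]."""
    design = np.column_stack([np.ones(len(currents)), features])
    coeffs, *_ = np.linalg.lstsq(design, np.log(currents), rcond=None)
    return coeffs

def predict(coeffs, features):
    """Evaluate the fitted exponential model."""
    return np.exp(coeffs[0] + features @ coeffs[1:])

def error_report(reference, predicted, threshold=0.03):
    """Return (max relative error, average relative error, fraction of samples above threshold)."""
    rel = np.abs(predicted - reference) / np.abs(reference)
    return rel.max(), rel.mean(), np.mean(rel > threshold)

# Synthetic reference data standing in for circuit simulation results.
rng = np.random.default_rng(0)
l_ch = rng.uniform(0.9, 1.1, 200)       # normalized channel length
t_ox = rng.uniform(0.9, 1.1, 200)       # normalized oxide thickness
temp = rng.uniform(300.0, 400.0, 200)   # kelvin
reference = np.exp(-2.0 - 3.0 * l_ch - 1.5 * t_ox + 0.02 * temp)

without_t = np.column_stack([l_ch, t_ox])
with_t = np.column_stack([l_ch, t_ox, temp])
report_without = error_report(reference, predict(fit_exponent(without_t, reference), without_t))
report_with = error_report(reference, predict(fit_exponent(with_t, reference), with_t))
```

On this synthetic data the model that omits temperature cannot follow the reference, while the model that includes it reproduces the data, mirroring the qualitative trend of TABLE I.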

¹ We consider the variations of the device channel length and the oxide thickness since the leakage power is very sensitive to them [1]. Although only these two parameters are considered, our framework can easily be extended to include the effects of other process variation parameters, such as the channel dopant variation.

Fig. 1. Compact thermal model of physical design.

B. Problem Formulation

The compact thermal model of a chip for the physical design stage is shown in Fig. 1 [10]. The primary heat flow path is composed of the thermal interface material, heat spreader and heat sink. The secondary heat flow path involves the interconnect layers, I/O pads, and the printed circuit board. The functional blocks are modeled as many power sources attached to the thin layer close to the top surface of the die. The main heat sources consist of the dynamic and leakage power consumed by devices. Because the dynamic power is insensitive to process variations and operating temperature [1], it is viewed as deterministic. However, the leakage power is strongly dependent on process parameters and operating temperature. Hence, the leakage power is viewed as a random process [1], and the thermal coupling effect needs to be considered for the full-chip temperature distribution analysis. By combining the compact thermal model and the statistical power consumption considering the thermal coupling effect, the steady-state temperature distribution $T(r, \theta, \epsilon)$ of the die is determined by the following statistical steady-state heat transfer equation

$$\nabla \cdot \left(\kappa(r, T)\, \nabla T(r, \theta, \epsilon)\right) = -p\left(r, L_{ch}(x, y, \theta), T_{ox}(x, y, \epsilon), T\right), \tag{4}$$

subject to the following boundary condition

$$\kappa(r_{bs}, T)\, \frac{\partial T(r_{bs}, \theta, \epsilon)}{\partial n_{bs}} + h_{bs}\, T(r_{bs}, \theta, \epsilon) = f_{bs}(r_{bs}). \tag{5}$$

Here, $\nabla \cdot$ is the divergence operator, and $\kappa(r, T)$ is the thermal conductivity of the die. $p(r, L_{ch}(x, y, \theta), T_{ox}(x, y, \epsilon), T)$ is the random process of the power density profile, which consists of the dynamic power density profile $p_d(r)$, the sub-threshold leakage power density profile $p_s(r, L_{ch}(x, y, \theta), T_{ox}(x, y, \epsilon), T)$, and the gate leakage power density profile $p_g(r, L_{ch}(x, y, \theta), T_{ox}(x, y, \epsilon), T)$. The position $r = (x, y, z) \in D$, where $D = (0, L_x) \times (0, L_y) \times (-L_z, 0)$ is the domain of the die, $L_x$ and $L_y$ are the lateral sizes of the die, and $L_z$ is the thickness of the die. $\theta$ and $\epsilon$ are sampling values of the manufacturing outcomes $\Omega_{L_{ch}}$ and $\Omega_{T_{ox}}$ for the channel length and oxide thickness, respectively. $L_{ch}(x, y, \theta)$ and $T_{ox}(x, y, \epsilon)$ are the channel length and the oxide thickness, respectively. $bs$ is any specific boundary surface of the die, and $r_{bs}$ is a position located on $bs$. $h_{bs}$ is the heat-transfer coefficient on $bs$, $f_{bs}(r_{bs})$ is the heat flux on $bs$, and $\partial/\partial n_{bs}$ is the differential operator along the outward direction normal to $bs$. Since the major part of the device current passes through the region close to the channel, the power density profile is nonzero only in that region, whose thickness is equal to the junction depth for the dynamic and sub-threshold leakage power and to the Debye length for the gate tunneling leakage power.

Fig. 2. The flowchart of proposed statistical electro-thermal analyzer.

With equations (4)–(5), our goal is to evaluate the mean and variance profiles of the steady-state full-chip temperature distribution.
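To make the structure of (4)–(5) concrete, the sketch below solves a strongly simplified one-dimensional version with finite differences: constant conductivity, a uniform deterministic power density, a convective boundary on one side and an adiabatic boundary on the other. The function name `solve_1d_heat`, the boundary choices and all material values are assumptions made for illustration; the full-chip problem of this work is three-dimensional and temperature dependent.

```python
import numpy as np

def solve_1d_heat(kappa, power_density, length, h_sink, t_amb, n=50):
    """Finite-difference solve of kappa * T'' = -p on [0, length].

    x = 0: convective boundary to ambient, i.e. f_bs = h_sink * t_amb in (5).
    x = length: adiabatic boundary, i.e. h_bs = 0 and f_bs = 0 in (5).
    """
    dx = length / n
    a = np.zeros((n + 1, n + 1))
    b = np.full(n + 1, -float(power_density))
    for i in range(1, n):
        a[i, i - 1] = a[i, i + 1] = kappa / dx**2
        a[i, i] = -2.0 * kappa / dx**2
    # Ghost-node treatment of the convective boundary at x = 0.
    a[0, 0] = -2.0 * kappa / dx**2 - 2.0 * h_sink / dx
    a[0, 1] = 2.0 * kappa / dx**2
    b[0] -= 2.0 * h_sink * t_amb / dx
    # Ghost-node treatment of the adiabatic boundary at x = length.
    a[n, n - 1] = 2.0 * kappa / dx**2
    a[n, n] = -2.0 * kappa / dx**2
    x = np.linspace(0.0, length, n + 1)
    return x, np.linalg.solve(a, b)

# Hypothetical values: W/(m K), W/m^3, m, W/(m^2 K), K.
kappa, p, length, h_sink, t_amb = 150.0, 1.0e9, 5.0e-4, 1.0e4, 300.0
x, temps = solve_1d_heat(kappa, p, length, h_sink, t_amb)

# Closed-form solution of the same simplified problem, used as a check.
analytic = t_amb + p * length / h_sink + p * (length * x - x**2 / 2.0) / kappa
```

In this reduced setting the numerical profile can be compared against the closed-form solution, which is a convenient sanity check before moving to a full deterministic thermal analyzer.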

III. PROPOSED STATISTICAL ELECTRO-THERMAL ANALYZER

The flowchart of the proposed statistical electro-thermal analyzer is shown in Fig. 2. Each operation in Phase 1 depends only on the technology node rather than on the design pattern, whereas the operations of Phase 2 are design dependent.

In Phase 1, given a spatial covariance function of the physical parameters, the KL expansion is employed to decompose the correlated parameters into a set of uncorrelated random variables. After that, the Smolyak sparse grid formula is used to generate sparse grids, which are a set of sampled random vectors of the KL-expanded and inter-die random variables. In Phase 2, for each sampled random vector on the sparse grid, a deterministic electro-thermal simulation is performed with the deterministic power profile obtained from this sampled random vector. Then, with all thermal profiles corresponding to the sampled random vectors on the sparse grid, an approximate representation of the stochastic full-chip temperature distribution is obtained by the Newton interpolating formula. Finally, the statistical characteristics, such as the mean and variance profiles, of the full-chip temperature distribution are extracted.

Different from the existing statistical thermal/electro-thermal analyzers [5], [6], the proposed framework can easily, accurately and efficiently obtain an approximate expression of the full-chip temperature distribution without resorting to complicated statistical calculation algorithms such as the power projection [5] and the iterative statistical temperature moment extraction [6]. This is because the power profile corresponding to each sampled random vector is deterministic while each deterministic electro-thermal analysis is performed. Hence, accurate but complicated leakage power models can be adopted in this framework. Each step in Fig. 2 is detailed in the following subsections.

A. Parameter Transformation

Generally, the process variations of a physical parameter $P$ can be classified into intra-die $P^{intra}$ and inter-die $P^{inter}$ variations, both of which can be modeled as Gaussian random variables [1]. The physical parameter $P \in \{T_{ox}, L_{ch}\}$ with its expected value $\bar{P}$ at position $r_{xy} = (x, y)$ can be written as

$$T_{ox}(r_{xy}, \epsilon) = \bar{T}_{ox}(r_{xy}) + \Delta T_{ox}^{intra}(r_{xy}, \epsilon_i) + \Delta T_{ox}^{inter}(r_{xy}, \epsilon_j), \tag{6}$$

$$L_{ch}(r_{xy}, \theta) = \bar{L}_{ch}(r_{xy}) + \Delta L_{ch}^{intra}(r_{xy}, \theta_i) + \Delta L_{ch}^{inter}(r_{xy}, \theta_j). \tag{7}$$

Here, $\epsilon_i$ and $\epsilon_j$ are subsets of $\epsilon$, and $\theta_i$ and $\theta_j$ are subsets of $\theta$.

According to [1], $T_{ox}(r_{xy}, \epsilon) = T_{ox}(x, y, \epsilon)$ is assumed to be spatially uncorrelated². Because the spatial correlation of $\Delta L_{ch}^{intra}(r_{xy}, \theta_i)$ might have different decreasing rates in the x- and y-directions, the spatial covariance function proposed in [11] is adopted for $\Delta L_{ch}^{intra}(r_{xy}, \theta_i)$³. Given $\sigma$ as the standard deviation of the target random process, and correlation lengths $\eta_x$ and $\eta_y$ in the x- and y-directions, respectively, the spatial covariance function between two random variables at points $r_{x_1 y_1}$ and $r_{x_2 y_2}$ is

$$C(r_{x_1 y_1}, r_{x_2 y_2}) = \sigma^2\, e^{-\frac{|x_1 - x_2|}{\eta_x}}\, e^{-\frac{|y_1 - y_2|}{\eta_y}}. \tag{8}$$

By applying the KL expansion, $\Delta L_{ch}^{intra}(r_{xy}, \theta_i)$ based on (8) can be approximated as

$$\Delta L_{ch}^{intra}(r_{xy}, \theta_i) \approx \sum_{m=1}^{N_{L_{ch}}} \sqrt{\chi_m}\, q_m(r_{xy})\, \zeta_m(\theta_i). \tag{9}$$

Here, the $\chi_m$'s are the eigenvalues of $C(r_{x_1 y_1}, r_{x_2 y_2})$, the $q_m$'s are the related eigenvectors, and $N_{L_{ch}}$ is the expansion length. $\{\zeta_m(\theta_i)\}$ is the set of uncorrelated standard normal random variables.

Because of the KL expansion property, the expanded random variables are Gaussian random variables if the target random process is Gaussian, and the closed form of the eigen-pair $(\chi_m, q_m(r_{xy}))$ can be derived [12]. In the rest of this paper, $\zeta = \{\zeta_m\}$ and $\varsigma = \{\varsigma_n\}$ are the sets of random variables that represent $L_{ch}$ and $T_{ox}$, respectively, $\tilde{\xi} = \zeta \cup \varsigma$, and $\theta$ and $\epsilon$ are dropped for the sake of notational simplicity.
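A discrete counterpart of (8) and (9) can be sketched as follows. Instead of the closed-form eigen-pairs of [12], this illustration assumes the covariance is sampled on a small set of die locations and uses a numerical eigendecomposition of the resulting matrix; the grid, $\sigma$, the correlation lengths and the helper names are hypothetical.

```python
import numpy as np

def covariance_matrix(points, sigma, eta_x, eta_y):
    """Eq. (8) evaluated between every pair of (x, y) points."""
    dx = np.abs(points[:, None, 0] - points[None, :, 0])
    dy = np.abs(points[:, None, 1] - points[None, :, 1])
    return sigma**2 * np.exp(-dx / eta_x) * np.exp(-dy / eta_y)

def kl_modes(cov, n_terms):
    """Largest n_terms eigenpairs of the covariance matrix (discrete KL expansion)."""
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_terms]      # keep the dominant modes
    return eigvals[order], eigvecs[:, order]

def sample_field(eigvals, eigvecs, zeta):
    """Eq. (9): sum over m of sqrt(chi_m) * q_m * zeta_m for one vector of standard normals."""
    return eigvecs @ (np.sqrt(np.clip(eigvals, 0.0, None)) * zeta)

# Hypothetical 6 x 6 grid of locations on a unit-size die.
axis = np.linspace(0.0, 1.0, 6)
points = np.array([(x, y) for x in axis for y in axis])
cov = covariance_matrix(points, sigma=0.05, eta_x=0.5, eta_y=0.5)

full_vals, full_vecs = kl_modes(cov, len(points))    # untruncated expansion
vals, vecs = kl_modes(cov, 10)                       # truncated to 10 random variables
captured = vals.sum() / np.trace(cov)                # fraction of total variance retained
delta_l = sample_field(vals, vecs, np.random.default_rng(1).standard_normal(10))
```

The ratio `captured` gives one practical way to choose the expansion length: modes are added until the retained variance is judged sufficient.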

B. Smolyak Sparse Grid Formulation

The basic idea of the Smolyak sparse grid formulation is to build an interpolating approximation of a high-dimensional multivariate function $u \in C^r$ using far fewer sampling values of the desired function than the full tensor product interpolation formula, but with an acceptable error bound on the order of $O\left(M^{-r} (\log M)^{(d-1)(r-1)}\right)$ [13]. Here, $M$ is the number of sampling points, and $d$ is the number of variables.

For the Monte Carlo method, the random variable samples are randomly generated, and a large number of samples is required to achieve accurate mean and variance estimation. For the Smolyak sparse grid formulation, the random variable samples are generated by using the roots of Hermite polynomial chaoses (H-PCs) or the extrema of the Chebyshev polynomial [14], and the desired solution is obtained by interpolation with these samples.

² Although $T_{ox}$ is assumed to be spatially uncorrelated, the proposed simulation mechanism still works when $T_{ox}$ is spatially correlated.

³ Although we choose the specific spatial covariance function (8), any valid spatial covariance function can be adopted.


The high order interpolating approximation can be achieved with a small number of samples [13].

According to the Smolyak sparse grid formulation [7], our desired full-chip statistical temperature distribution $T(r, \tilde{\xi})$, represented by a set of KL-expanded random variables $\tilde{\xi}$, can be explicitly approximated as [15]

$$\tilde{T}_q^d(r, \tilde{\xi}) = \sum_{q-d+1 \le |i| \le q} (-1)^{q-|i|} \binom{d-1}{q-|i|} \left(Q^{i_1}(T) \otimes \cdots \otimes Q^{i_d}(T)\right), \tag{10}$$

where $d$ is the number of random variables in $\tilde{\xi}$, $q$ is the level of the desired solution, $Q^{i_n}$ with level $i_n \ge 1$ is the one-dimensional interpolating operator of $T(r, \tilde{\xi})$ with respect to the $n$-th random variable in $\tilde{\xi}$, $\otimes$ is the functional cross product, and $|i| = i_1 + \cdots + i_n + \cdots + i_d$. The level $i_n$ is the index that decides the number of sampling values for the interpolating polynomial $Q^{i_n}$. As suggested in [16], the relation between the number of sampling values $m_{i_n}$ and the level $i_n$ is $m_1 = 1$ and $m_{i_n} = 2^{i_n - 1} + 1$ for $i_n > 1$.

From (10), we only need to know the temperature on the following small set of sampling values for $\tilde{\xi}$ [17]. The sparse grid, i.e., the set of sampling values of $\tilde{\xi}$ in (10), is derived as

$$H(q, d) = \bigcup_{q-d+1 \le |i| \le q} \left(\vartheta^{i_1} \times \cdots \times \vartheta^{i_n} \times \cdots \times \vartheta^{i_d}\right), \tag{11}$$

where $\vartheta^{i_n}$ denotes the set of sampling points of $\tilde{\xi}_n$, and '$\times$' is the cross product of the point sets.

The number of sampling points from the Smolyak sparse grid formulation increases as $O\left(\frac{d^{q-d}}{(q-d)!}\right)$, which is less severe than that of the full tensor product formulation. The runtime complexity of our proposed statistical electro-thermal analyzer is therefore $O\left(C_{det}\, \frac{d^{q-d}}{(q-d)!}\right)$, where $C_{det}$ is the runtime complexity of executing one deterministic electro-thermal simulation.
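The point set in (11) can be enumerated directly, as in the following sketch. For simplicity it assumes the nested Chebyshev extrema on $[-1, 1]$ as the one-dimensional node sets, with $m_1 = 1$ and $m_{i_n} = 2^{i_n - 1} + 1$; this is an illustrative choice and differs from the H-PC roots used in the experimental implementation. The function names are hypothetical.

```python
import itertools
import math

def chebyshev_extrema(level):
    """Nested 1-D nodes on [-1, 1]: m_1 = 1 and m_i = 2**(i-1) + 1 for i > 1."""
    if level == 1:
        return [0.0]
    m = 2 ** (level - 1) + 1
    # Rounding makes shared nodes compare equal; adding 0.0 turns -0.0 into 0.0.
    return [round(-math.cos(math.pi * j / (m - 1)), 12) + 0.0 for j in range(m)]

def sparse_grid(q, d):
    """Union of tensor grids over multi-indices with q-d+1 <= |i| <= q, as in (11)."""
    points = set()
    for idx in itertools.product(range(1, q - d + 2), repeat=d):
        if q - d + 1 <= sum(idx) <= q:
            points.update(itertools.product(*(chebyshev_extrema(i) for i in idx)))
    return sorted(points)

grid_low = sparse_grid(3, 2)    # d = 2, q = d + 1
grid_high = sparse_grid(4, 2)   # d = 2, q = d + 2
full_tensor_points = len(chebyshev_extrema(3)) ** 2
```

For two variables the level $q = d + 2$ grid in this sketch contains 13 points, compared with 25 points for the full tensor grid built from the same five one-dimensional nodes, and the lower-level grid is a subset of the higher-level one.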

The sampling values corresponding to $\vartheta^{i_n}$ must be properly chosen. Adopting the roots of the H-PC whose order corresponds to the level $i_n$ achieves the most accurate result if $\tilde{\xi}$ is a set of normal random variables [14]. On the other hand, adopting the extrema of the Chebyshev polynomial whose order corresponds to the level $i_n$ achieves a nested sparse grid structure for any level together with acceptable accuracy [16]. In this paper, we adopt the roots of H-PCs in our experimental implementation because the results are shown to be very accurate with a low-level approximation, and the nested sparse grid structure is still preserved for $q = d + 1$⁴.

⁴ If a high-order approximation is required for accuracy, we suggest the extrema of the Chebyshev polynomial because its nested sparse grid structure is preserved for any level; hence, it needs far fewer sampling points than the roots of the H-PCs for a high-order approximation.

C. Calculation of Temperature Profiles on Sparse Grids

After the sparse grid $H(q, d)$ has been obtained, the samples of channel length and oxide thickness corresponding to the $m$-th sampling grid point $\tilde{\xi}^m$ of $H(q, d)$ can be obtained by using the parameter modeling technique stated in Section III-A. Hence, the deterministic power density profile corresponding to $\tilde{\xi}^m$ can be obtained. With the deterministic power density profile, we have the following deterministic steady-state heat transfer equation

$$\nabla \cdot \left(\kappa(r, T)\, \nabla T(r, \tilde{\xi}^m)\right) = -p(r, \tilde{\xi}^m, T), \tag{12}$$

subject to the following boundary condition

$$\kappa(r_{bs}, T)\, \frac{\partial T(r_{bs}, \tilde{\xi}^m)}{\partial n_{bs}} + h_{bs}\, T(r_{bs}, \tilde{\xi}^m) = f_{bs}(r_{bs}). \tag{13}$$

Here, $p(r, \tilde{\xi}^m, T)$ and $T(r, \tilde{\xi}^m)$ are the deterministic power density and temperature profiles with respect to $\tilde{\xi}^m$, respectively. Since the power density profile is temperature dependent in equation (12), the deterministic electro-thermal analysis is used to get each $T(r, \tilde{\xi}^m)$ and is summarized in Fig. 3.

Algorithm: Calculation of Temperature Profiles on Sparse Grid
Input: Sampling point $\tilde{\xi}^i$, initial temperature $T_{ini}$ and $p_{dyn}(r)$
Output: Stable temperature profile $T(r, \tilde{\xi}^i)$ of $\tilde{\xi}^i$
Begin
  Obtain $T_{ox}(r_{xy}, \tilde{\xi}^i)$ and $L_{ch}(r_{xy}, \tilde{\xi}^i)$ according to $\tilde{\xi}^i$;
  $T(r, \tilde{\xi}^i) \leftarrow T_{ini}$;
  $T^*(r, \tilde{\xi}^i) \leftarrow 0$;
  While ($|T(r, \tilde{\xi}^i) - T^*(r, \tilde{\xi}^i)|$ > Converging criterion)
    $T^*(r, \tilde{\xi}^i) \leftarrow T(r, \tilde{\xi}^i)$;
    Update $p_{leakage}(r, \tilde{\xi}^i, T)$ by $T(r, \tilde{\xi}^i)$;
    $p_{total}(r, \tilde{\xi}^i, T) \leftarrow p_{leakage}(r, \tilde{\xi}^i, T) + p_{dyn}(r)$;
    † Solve deterministic thermal equations (12) and (13) with $p_{total}(r, \tilde{\xi}^i, T)$ to obtain a new $T(r, \tilde{\xi}^i)$;
    if ($T(r, \tilde{\xi}^i)$ = Infinite) then Thermal runaway;
  Return $T(r, \tilde{\xi}^i)$
End
† The deterministic thermal analyzer [18] is used to solve (12) and (13). Any deterministic thermal analyzer can be used here.

Fig. 3. Deterministic electro-thermal analysis for each sampling point in the sparse grid. $p_{leakage}$, $p_{dyn}$ and $p_{total}$ are the leakage, dynamic and total power density profiles for each sampling point of the sparse grid, respectively.
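The iteration of Fig. 3 can be sketched in a few lines of Python. The sketch below is not the thermal analyzer of [18]: it assumes that solving (12) and (13) can be replaced by a hypothetical lumped thermal-resistance matrix `r_th` acting on per-block powers, and the leakage law, the runaway threshold and all numerical values are placeholders for illustration.

```python
import numpy as np

def electro_thermal_solve(r_th, p_dyn, leakage_fn, t_amb, t_init,
                          tol=1e-6, max_iter=200, t_runaway=1.0e4):
    """Fixed-point iteration between temperature and temperature-dependent leakage."""
    temp = np.full(len(p_dyn), float(t_init))
    for _ in range(max_iter):
        p_total = p_dyn + leakage_fn(temp)      # update leakage with the current temperature
        new_temp = t_amb + r_th @ p_total       # stand-in for solving (12) and (13)
        if not np.all(np.isfinite(new_temp)) or new_temp.max() > t_runaway:
            raise RuntimeError("thermal runaway")
        if np.max(np.abs(new_temp - temp)) <= tol:
            return new_temp                     # converged temperature profile
        temp = new_temp
    raise RuntimeError("no convergence within max_iter iterations")

# Hypothetical two-block example: K/W, W, K.
r_th = np.array([[2.0, 0.5], [0.5, 2.0]])
p_dyn = np.array([5.0, 3.0])
t_amb = 300.0

def mild_leakage(temp):
    """Placeholder leakage power that grows exponentially with temperature."""
    return 0.5 * np.exp(0.01 * (temp - 300.0))

steady = electro_thermal_solve(r_th, p_dyn, mild_leakage, t_amb, t_init=t_amb)
```

The loop keeps iterating while successive temperature profiles differ by more than the tolerance, and it stops with an error when the temperature grows without bound, which corresponds to the thermal runaway branch of Fig. 3.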

D. Polynomial Interpolation of Temperature Distribution

Instead of directly using equation (10), which requires obtaining a different $Q^{i_1}(T) \otimes \cdots \otimes Q^{i_d}(T)$ for each different $|i| = i_1 + \cdots + i_d$, we take advantage of the nested sparse grid structure and perform the Newton interpolating method [14] once to globally interpolate $T(r, \tilde{\xi})$ from the deterministic temperature profiles of all sampling values in the sparse grid. For a sparse grid that does not preserve the nested structure, the Newton interpolating method can be applied to obtain each different $Q^{i_1}(T) \otimes \cdots \otimes Q^{i_d}(T)$.

Based on the Newton interpolating formula, the temperature at a specified die position $r^*$ can be approximated as

$$T(r^*, \tilde{\xi}) \approx \sum_{m=0}^{N} \hat{a}_m(r^*)\, \phi_m(\tilde{\xi}), \tag{14}$$

where $\phi_m(\tilde{\xi})$ is an interpolating polynomial with respect to the $m$-th sampling value $\tilde{\xi}^m$, and its form can be found in [14]. Here, $N = |H(q, d)| - 1$, $|H(q, d)|$ is the number of sampling values in the sparse grid, and the coefficients $\hat{a}_m(r^*)$ need to be determined.

Based on the basic idea of interpolation that the approximated function must match each known data point, the interpolating polynomial in (14) must satisfy equation (15) for each $\tilde{\xi}^k$:

$$\sum_{m=0}^{N} \hat{a}_m(r^*)\, \phi_m(\tilde{\xi}^k) = T(r^*, \tilde{\xi}^k). \tag{15}$$
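For a single random variable, the matching conditions in (15) reduce to classical Newton divided differences, which the following sketch illustrates. It is a one-dimensional illustration only: the multivariate basis $\phi_m$ of [14] is not reproduced, the sampled temperatures come from a made-up quadratic response, and the mean and variance are extracted with Gauss-Hermite quadrature as an assumed post-processing step.

```python
import numpy as np

def newton_coefficients(nodes, values):
    """Divided-difference coefficients so that the interpolant matches every sample."""
    nodes = np.asarray(nodes, dtype=float)
    coeffs = np.array(values, dtype=float)
    for order in range(1, len(nodes)):
        coeffs[order:] = (coeffs[order:] - coeffs[order - 1:-1]) / (nodes[order:] - nodes[:-order])
    return coeffs

def newton_evaluate(nodes, coeffs, xi):
    """Horner-style evaluation of sum_m a_m * prod_{k<m} (xi - nodes[k])."""
    xi = np.asarray(xi, dtype=float)
    result = np.full_like(xi, coeffs[-1])
    for m in range(len(coeffs) - 2, -1, -1):
        result = result * (xi - nodes[m]) + coeffs[m]
    return result

# Hypothetical temperatures (kelvin) at three sampling values of one standard normal variable.
nodes = np.array([-1.5, 0.0, 1.5])
values = 350.0 + 4.0 * nodes + 0.5 * nodes**2
coeffs = newton_coefficients(nodes, values)

# Mean and variance of the interpolated temperature under a standard normal variable.
gh_x, gh_w = np.polynomial.hermite_e.hermegauss(5)
samples = newton_evaluate(nodes, coeffs, gh_x)
mean = np.sum(gh_w * samples) / np.sum(gh_w)
variance = np.sum(gh_w * (samples - mean) ** 2) / np.sum(gh_w)
```

Once the coefficients are known, evaluating the interpolant is cheap, so statistical moments can be computed without further thermal simulations.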


