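National Science Council, Executive Yuan: Research Project Final Report
Development of Electronic Design Automation Technologies for 3D Integration, Subproject 1: Stochastic Electro-Thermal Simulation for 3-D ICs and Its Application to 3-D IC Power Optimization (2/2)
Final Research Report (Complete Version)
Project type: Integrated
Project number: NSC 99-2220-E-009-035-
Project period: August 1, 2010 to July 31, 2011
Institution: Department of Communication Engineering, National Chiao Tung University
Principal investigator: Yu-Min Lee
Project participants: Master's student, part-time assistant: 吳宗恆
Master's student, part-time assistant: Ting-Jung Li
Master's student, part-time assistant: Chih-Sheng Wang
Ph.D. student, part-time assistant: Pei-Yu Huang
Ph.D. student, part-time assistant: Shu-Han Whi
Ph.D. student, part-time assistant: Chi-Wen Pan
Report attachments: Reports on attendance at international conferences and the published papers
Handling: This project involves patents or other intellectual property rights; open to public query after 2 years
October 30, 2011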
Development of Electronic Design Automation Technologies for 3D Integration
Subproject 1: Stochastic Electro-Thermal Simulation for 3-D ICs and Its Application to 3-D IC Power Optimization (2/2)
Project number: NSC 99-2220-E-009-035-
Project period: August 1, 2010 to July 31, 2011
Principal investigator: Yu-Min Lee
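I. Abstract (Chinese)
Existing power optimization techniques for 3D ICs rarely discuss multiple supply voltage techniques. This project uses multiple supply voltages to reduce the power consumption of 3D ICs, and the developed method has three major parts: (1) supply voltage assignment for 3D ICs, which considers three factors: sensitivity, the proximity effect, and the level shifter budget; (2) electro-thermal analysis of 3D ICs, which obtains the temperature distribution of the 3D IC; and (3) thermal-aware static timing analysis, which analyzes the delay of the 3D IC. Experimental results verify the effectiveness of this multiple supply voltage technique.
Keywords: 3D IC; electro-thermal simulation; power optimization; low-power design; multiple supply voltage design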
II. Abstract (English)
Few existing works on power reduction in 3D ICs discuss the use of supply voltage scaling techniques for power optimization. In this work, a power reduction method based on supply voltage assignment is presented for minimizing the power consumption of 3D ICs. The proposed approach has three major parts: (1) 3D IC Voltage Assignment for power reduction, which considers three factors: sensitivity, the proximity effect, and the level shifter (LS) budget; (2) 3D Electro-Thermal Analysis for obtaining the temperature distribution of the 3D IC; and (3) Thermal Aware Static Timing Analysis for obtaining the temperature-dependent delay values of functional gates. The experimental results demonstrate the effectiveness of the developed voltage assignment method in power reduction.
Keywords: 3-D IC, Electro-Thermal Analysis, Power Optimization, Low Power Design, Multiple Supply Voltage Design
III. Background and Objectives
In recent years, many researchers have shown that 3D ICs can greatly enhance system integration. However, heat removal is a major challenge in 3D IC design because of the high power density and the low thermal conductivity of the inter-layer dielectrics. Moreover, high temperature seriously affects the timing, power, and reliability of a circuit design [1]. Therefore, it is necessary to reduce the power consumption of circuits to mitigate their thermal problems.
Among the existing power reduction techniques, multiple supply voltage (MSV) [2-5] is an effective technique for reducing dynamic power and leakage power. Intuitively, any MSV technique for 2D ICs can be extended to 3D ICs, with the voltage scaling performed tier by tier. However, a voltage assignment procedure that does not consider all tiers simultaneously might unbalance the power consumption distribution across layers and cause thermal problems.
In this work, based on [6], a grid-based post-placement MSV method is developed for reducing 3D IC power consumption and is extended to explore more possibilities. The developed technique simultaneously considers the level shifter (LS) budget and the thermal effect. Compared with previous works that ignore the LS issue and use temperature-independent models, the proposed approach is more flexible and practical.
The report is organized as follows. First, the proposed power optimization methodology is presented in Section IV. After that, experimental results are given in Section V. Finally, some conclusions are drawn in Section VI.
IV. Research Method
The proposed post-placement MSV-based method for power reduction during the 3D IC design flow is shown in Fig. 1. Given a 3D IC placement, a netlist, a cell library, and a timing/leakage power cell library, each tier is partitioned into n grids for the grid-based procedure, as illustrated in Fig. 2(a). First, the initial supply voltage of all gates is set to the high supply voltage $V_{DDH}$, and the initial temperature of the chip is obtained by 3D Electro-Thermal Analysis (Section IV.A). Then, Thermal Aware Static Timing Analysis (Section IV.B) is performed with the temperature-dependent gate delays derived from the initial temperature of the 3D Electro-Thermal Analysis step. After that, the Initial Voltage Assignment (Section IV.C) is executed; the circuit timing might be violated because the low supply voltage $V_{DDL}$ is aggressively assigned to all gates. Then, a grid-based procedure is used for the 3D IC Voltage Assignment (Section IV.D), which assigns an appropriate $V_{DD}$ to trade off the power consumption penalty against the timing saving advantage. After each execution of the voltage assignment procedure, the power consumption and the delay of each gate change; hence, the thermal and timing analyses should be rerun to update the temperature-dependent gate delays and the leakage power. 3D IC Voltage Assignment is repeated until no grid can be selected or the timing violations are rescued (Section IV.E). The voltage assignment result obtained by the proposed method can be applied to any voltage island generator. Moreover, the 3D IC Voltage Assignment method could be more beneficial for voltage island generation because it considers the proximity effect.
Fig. 2. A three-tier design example of the grid-based procedure for generating the
voltage assignment
A. 3D Electro-Thermal Analysis
Generally, dynamic power is independent of temperature, whereas leakage power is significantly affected by it. Based on the empirical model in [7], the gate leakage current $I_{gate}$ depends on the oxide thickness, and the subthreshold current $I_{sub}$ depends on the channel length and the temperature. With the oxide thickness and the channel length fixed, the subthreshold current model of a gate can be built by least-squares fitting to HSPICE estimates under the 90nm technology, as follows.
(1)
(2)
Here, $s_0$ and $s_1$ are fitting constants, and $T$ is the operating temperature of the gate/cell.
Since the fitting constants depend on the supply voltage, a look-up table stores them for the different supply voltages, $V_{DDL}$ and $V_{DDH}$.
With (1) and (2), the gate tunneling leakage power and the subthreshold leakage power are
(3)
(4)
The 3D Electro-Thermal Analysis is performed with an iterative electro-thermal updating loop that integrates (3)-(4), the 3D IC statistical thermal simulator [6], and the 3D IC analytical thermal simulator [8].
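The iterative updating loop can be illustrated with a minimal sketch. It is not the report's simulator: `toy_leakage` and `toy_thermal_solve` are hypothetical stand-ins (an assumed exponential leakage model and an uncoupled thermal-resistance model) for equations (3)-(4) and the thermal simulators of [6] and [8], and all constants are made up for illustration.

```python
import math

def electro_thermal_loop(p_dynamic, leakage_power, thermal_solve,
                         t_init=300.0, tol=1e-3, max_iter=100):
    """Fixed-point iteration between leakage power and temperature.

    p_dynamic:     list of temperature-independent dynamic powers (W)
    leakage_power: callable(T) -> leakage power (W) of one cell at temperature T
    thermal_solve: callable(list of powers) -> list of temperatures (K)
    """
    temps = [t_init] * len(p_dynamic)
    for iteration in range(1, max_iter + 1):
        # Total power = dynamic power + leakage at the current temperature.
        powers = [pd + leakage_power(t) for pd, t in zip(p_dynamic, temps)]
        new_temps = thermal_solve(powers)
        change = max(abs(a - b) for a, b in zip(new_temps, temps))
        temps = new_temps
        if change < tol:
            return temps, powers, iteration
    raise RuntimeError("electro-thermal loop did not converge")

# Hypothetical models, assumed only for this sketch.
def toy_leakage(t):
    return 0.01 * math.exp(0.02 * (t - 300.0))

def toy_thermal_solve(powers):
    # Each cell sees 300 K ambient plus 100 K/W of thermal resistance.
    return [300.0 + 100.0 * p for p in powers]

temps, powers, iterations = electro_thermal_loop(
    [0.05, 0.10], toy_leakage, toy_thermal_solve)
```

The loop stops when the largest temperature change between two passes falls below `tol`; with a real thermal simulator, only `thermal_solve` would change.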
B. Thermal Aware Static Timing Analysis
The thermal-aware static timing analysis (STA) is a conventional block-based STA. Each gate delay is temperature dependent and is built in a canonical first-order form by least-squares fitting to Monte Carlo HSPICE results under the 90nm technology. The general delay expression of a specific gate is
(5)
Here, $a_0$-$a_4$ are $V_{DD}$-dependent fitting constants, and a look-up table is constructed to store these coefficients for $V_{DDH}$ and $V_{DDL}$. $C_L$ is the load capacitance.
C. Initial Voltage Assignment
Given a timing-satisfied circuit with the initial supply voltage $V_{DDH}$, an Initial Voltage Assignment procedure is performed to save the most power without considering the timing constraints. All gates are assigned to operate at $V_{DDL}$, and this step might cause timing violations. After the [...] each gate. With the updated operating temperature of each gate from 3D Electro-Thermal Analysis and (5), the updated delay of each gate can be obtained. Then, the updated arrival time (AT), the updated required arrival time (RT), and the slack of each gate are calculated. Finally, a gate is referred to as a site if its slack is negative, and the set of sites is passed to the next step.
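The AT/RT/slack computation and the extraction of sites can be sketched as a small block-based STA. This is an illustrative sketch only: gate delays are passed in as plain numbers (in the report they would come from (5) at the updated temperature), and the four-gate netlist and the timing constraint are hypothetical.

```python
from collections import defaultdict, deque

def compute_slacks(delays, edges, required_time):
    """Block-based STA on a combinational DAG.

    delays:        {gate: delay}
    edges:         list of (driver, sink) pairs
    required_time: timing constraint at the primary outputs
    Returns {gate: slack}, where slack = RT - AT at the gate output.
    """
    fanin, fanout = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fanout[u].append(v)
        fanin[v].append(u)

    # Topological order (Kahn's algorithm).
    indegree = {g: len(fanin[g]) for g in delays}
    queue = deque(g for g in delays if indegree[g] == 0)
    order = []
    while queue:
        g = queue.popleft()
        order.append(g)
        for s in fanout[g]:
            indegree[s] -= 1
            if indegree[s] == 0:
                queue.append(s)

    # Forward pass: arrival time at each gate output.
    at = {}
    for g in order:
        at[g] = delays[g] + max((at[p] for p in fanin[g]), default=0.0)

    # Backward pass: required arrival time at each gate output.
    rt = {}
    for g in reversed(order):
        rt[g] = min((rt[s] - delays[s] for s in fanout[g]),
                    default=required_time)

    return {g: rt[g] - at[g] for g in order}

def find_sites(slacks):
    """A gate is a site if its slack is negative."""
    return sorted(g for g, s in slacks.items() if s < 0)

# Hypothetical netlist: A and B drive C, and C drives D.
delays = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 1.0}
edges = [("A", "C"), ("B", "C"), ("C", "D")]
sites = find_sites(compute_slacks(delays, edges, required_time=5.5))
```

With a required time of 5.5 the critical path B, C, D has arrival time 6.0, so those three gates become sites while A keeps positive slack.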
D. 3D IC Voltage Assignment
The algorithm of 3D IC Voltage Assignment is shown in Fig. 3. Given a set of sites from Initial Voltage Assignment, the Grid-Based Procedure is executed to decide which grid should be picked first, and then sites in the chosen grid are assigned to operate at $V_{DDH}$ to rescue timing. Then, Voltage Re-Assignment is performed by selecting several sites in the selected grid to operate at $V_{DDH}$ for timing rescue. The above procedure is repeated until none of the sites in the selected grid can be re-assigned to operate at $V_{DDH}$, or all sites in this grid have been rescued successfully.
Fig. 3. Algorithm of 3D IC voltage assignment
D.1. Grid-Based Procedure
We extend the idea of [6] into a grid-based procedure that decides which grid is picked first; sites in this grid are then assigned to operate at $V_{DDH}$ to meet the timing constraints under the complex three-dimensional structure. The goal of this decision is to trade off the power consumption penalty against the timing saving advantage effectively.
First, the weight of each grid is determined by considering three factors: 1) the sensitivity factor, 2) the proximity factor, and 3) the LS budget factor. Then, the three-dimensional structure is vertically compressed into a two-dimensional plane, as illustrated in Fig. 2(b), and the weight of each compressed grid is obtained by accumulating the weights of the grids along the z-axis. After that, the compressed grid with the maximum accumulated weight is selected, which completes the grid-based decision step. Finally, as shown in Fig. 2(c), the selected compressed grid is restored to the original three-layer structure. This procedure decides which grid is assigned first in the next stage, Voltage Re-Assignment. The weight of a grid i is defined as
(6)
where one component combines the sensitivity factor and the LS budget factor of grid i, and the other is the proximity factor of grid i. Two main concerns shape the definition of the first component. The first concern ($S_{site}$) is how to make an assignment decision that obtains the most timing rescue at the least penalty in power saving. The second concern ($\lambda_i$) is how to make an assignment decision that leads to less LS overhead (fewer level shifters). The proximity factor accounts for clustering: we prefer that more gates in a single grid form clusters, with all gates in a cluster operating at the same $V_{DD}$. The sensitivity factor and the proximity factor are defined as follows.
(7)
(8)
where the sensitivity sum is the sum of the site sensitivities $S_{site}$ in grid i, $N_i$ is the number of all gates in grid i, the two gate counts are the number of gates operating at $V_{DDH}$ and the number of gates operating at $V_{DDL}$, and $\lambda_i$ is the LS budget factor, which indicates whether the estimated [...] is appropriate.
Once the operating voltages of all sites in the selected grid are determined, the needed number of level shifters is determined, and $\lambda_i$ can be defined as
(9)
Here, the first quantity is the number of level shifters needed if all sites in grid i are assigned to operate at $V_{DDH}$, and the second is the number of level shifters available in grid i, which is estimated from the attainable white space of grid i.
Finally, the grid sensitivity and the site sensitivity are defined as follows.
(10)
(11)
where the two difference terms are the delay difference and the power dissipation difference between $V_{DDL}$ and $V_{DDH}$, respectively; the slack term is the slack of site j; and $P_{LS}$ is the power of an LS.
D.2. Voltage Re-Assignment
Based on the decision of the grid-based procedure, a Voltage Re-Assignment procedure is performed to obtain the best timing rescue at the least power saving penalty within the selected grid i. Two factors, sensitivity and the LS budget, are considered when selecting a site to operate at $V_{DDH}$. To start with, the site with the maximum sensitivity is selected. Then, the number of level shifters that would be used if the site were assigned to operate at $V_{DDH}$ is checked. If this number is larger than the LS budget of the selected grid, the selected site is not assigned to operate at $V_{DDH}$, and the site with the next largest sensitivity is tried, until a site meets both constraints at the same time. After that, the selected site in grid i is assigned to operate at $V_{DDH}$ to rescue the timing. When the site is assigned to operate at $V_{DDH}$, the timing and sensitivity information of the gates is affected and should be updated. The number of sites in grid i can decrease during the assignment procedure. After the update, the next site with the maximum sensitivity is selected, and the assignment step is executed again. The selection and update steps are repeated until no site remains in grid i, or all sites in grid i have been selected and assigned. Therefore, the re-assignment procedure is executed at most as many times as the initial number of sites in grid i, and the computation load of updating the sensitivity is reduced.
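The re-assignment loop within one selected grid can be sketched as a greedy procedure. This is a sketch under stated assumptions, not the report's implementation: the sensitivity values, the level shifter counting, and the timing update are supplied as hypothetical callables, and a site that fails the LS budget check is assumed not to be retried, which keeps the iteration count within the initial number of sites.

```python
def reassign_voltage(sites, sensitivity, ls_count_if_raised, ls_budget, update):
    """Greedy Voltage Re-Assignment inside one selected grid.

    sites:              iterable of site ids (gates with negative slack)
    sensitivity:        callable(site) -> current sensitivity of the site
    ls_count_if_raised: callable(raised, site) -> level shifters needed in
                        the grid if `site` is raised in addition to `raised`
    ls_budget:          LS budget of the grid
    update:             callable(raised, remaining) -> sites still violating
                        timing after the timing information is updated
    Returns (sites raised to V_DDH, sites left unrescued).
    """
    remaining = set(sites)
    raised = set()
    skipped = set()  # sites rejected by the LS budget check
    while remaining - skipped:
        # Sorting first makes tie-breaking deterministic.
        site = max(sorted(remaining - skipped), key=sensitivity)
        if ls_count_if_raised(raised, site) > ls_budget:
            skipped.add(site)
            continue
        raised.add(site)
        remaining.discard(site)
        # Raising one site may rescue others; the site set can only shrink.
        remaining = set(update(raised, remaining)) & remaining
    return raised, remaining

# Hypothetical data for three sites in one grid.
sens = {"a": 3.0, "b": 2.0, "c": 1.0}
ls_cost = {"a": 2, "b": 1, "c": 1}

def count_ls(raised, site):
    return sum(ls_cost[s] for s in raised | {site})

raised, unrescued = reassign_voltage(
    ["a", "b", "c"], sens.get, count_ls, ls_budget=3,
    update=lambda raised, remaining: remaining)
```

Here sites a and b fit within the budget of three level shifters, while raising c would need a fourth, so c is left unrescued for the later rescue steps.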
E. Timing and LS Budget Rescue
E.1. Timing Rescue
The 3D IC Voltage Assignment result might leave the circuit timing violated because the LS budget is limited; we cannot arbitrarily assign $V_{DDH}$ to a site. Therefore, a timing rescue is executed to rescue the timing of the circuit. For example, in the circuit schematic of Fig. 4(a), the red gates are sites and the blue gates operate at $V_{DDL}$. First, we compute the gain of each site and sort the sites by gain. The gain is the delay difference between $V_{DDH}$ and $V_{DDL}$. The site with the maximum gain is selected first and assigned to operate at $V_{DDH}$. Then, the fan-in gates of the site are selected. For example, we first select the fan-in gate C, which has the maximum AT, to operate at $V_{DDH}$, and update its timing information. If the maximum AT moves from gate C to gate D after gate C is assigned to operate at $V_{DDH}$, we must assign $V_{DDH}$ to gate D as well. In this way, all fan-in gates of the site are checked until the dominant gate is found. Although the timing is rescued, the number of level shifters might exceed the LS budget, because the timing rescue procedure does not consider the LS budget. Therefore, we have to sacrifice some power saving by reassigning more $V_{DDL}$ gates to $V_{DDH}$ in order to reduce the number of level shifters. The following step rescues the level shifter usage so that it meets the LS budget.
E.2. LS Budget Rescue
Intuitively, we can start with the fan-in gate of an LS and assign $V_{DDH}$ to it. This not only reduces the usage of level shifters but also maintains the correctness of the circuit. In Fig. 4(a), the yellow gates are level shifters, the red gates operate at $V_{DDH}$, and the blue gates operate at $V_{DDL}$. The LS budget rescue step finds all level shifters and checks whether their $V_{DDL}$ fan-in gates can be changed to $V_{DDH}$ to reduce the number of level shifters. For example, Fig. 4(a) shows that gate C is the fan-in of two LSs (No. 1 and No. 2), and gate D is the fan-in of one LS (No. 3). The gain of gate C is -2+1=-1, and the gain of gate D is -1+2=1. That is, raising gate C removes one LS in total, whereas raising gate D adds one. Therefore, we select gate C to operate at $V_{DDH}$ to reduce the usage of level shifters, as shown in Fig. 4(b). Similarly, the gains of gates A and D in Fig. 4(b) are then checked. The rescue procedure continues until no gate can be assigned to operate at $V_{DDH}$ to reduce the usage of level shifters.
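The gain computation in this example can be reproduced with a small sketch. Several things here are assumptions rather than facts from the report: a level shifter is counted on every connection from a $V_{DDL}$ gate to a $V_{DDH}$ gate, the loop also stops once the budget is met, and the netlist is a hypothetical topology chosen only so that gates C and D get the gains -1 and 1 quoted above (Fig. 4 itself is not available).

```python
def count_level_shifters(edges, volt):
    """One LS per connection from a V_DDL ("L") gate to a V_DDH ("H") gate."""
    return sum(1 for u, v in edges if volt[u] == "L" and volt[v] == "H")

def gain(gate, edges, volt):
    """Change in LS count if `gate` (currently at V_DDL) is raised to V_DDH."""
    removed = sum(1 for u, v in edges if u == gate and volt[v] == "H")
    added = sum(1 for u, v in edges if v == gate and volt[u] == "L")
    return -removed + added

def ls_budget_rescue(edges, volt, ls_budget):
    """Raise LS fan-in gates while doing so reduces the LS count."""
    volt = dict(volt)
    while count_level_shifters(edges, volt) > ls_budget:
        drivers = sorted({u for u, v in edges
                          if volt[u] == "L" and volt[v] == "H"})
        best = min(drivers, key=lambda g: gain(g, edges, volt))
        if gain(best, edges, volt) >= 0:
            break  # no gate can reduce the LS usage any further
        volt[best] = "H"
    return volt

# Hypothetical netlist: C drives two V_DDH gates, D drives one.
edges = [("B", "A"), ("A", "C"), ("C", "X"), ("C", "Y"),
         ("E", "D"), ("F", "D"), ("D", "Z")]
volt = {"A": "L", "B": "L", "C": "L", "D": "L", "E": "L", "F": "L",
        "X": "H", "Y": "H", "Z": "H"}
rescued = ls_budget_rescue(edges, volt, ls_budget=0)
```

Starting from three level shifters, the sketch raises gate C, after which gates A and D have gains 0 and 1, so it stops with two level shifters even though the budget of zero is not met.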
V. Experimental Method and Results
We implemented the proposed method in C++ and applied it to a set of ISCAS89 benchmark circuits and private designs. First, Design Compiler is used to synthesize the benchmark circuits with the UMC 90nm standard cell library. Next, the initial 2D placement of each test circuit is generated by SOC Encounter, and the corresponding 3D placement is obtained by transforming the 2D placement with Z-Place [9]. The timing/leakage power cell library with temperature effects is generated by evaluating the average leakage current and the gate delay in HSPICE simulation for various types of logic gates. From the HSPICE simulation results, the fitting constants of the leakage current and delay models are obtained by least-squares fitting. For simplicity, a single type of LS is used in the experiments.
A. Comparison of Voltage Assignment Results
TABLE I lists the results of voltage assignment in the different phases. The number of sites after Initial Voltage Assignment is listed in column 4. Columns 5-6 show the number of sites and the usage of level shifters after 3D IC Voltage Assignment. The results of Timing Rescue and LS Budget Rescue are listed in columns 7-8 and column 9, respectively. Finally, a summary is given in columns 10-11.
First, the results of 3D IC Voltage Assignment show that the timing rescue after Initial Voltage Assignment is significantly limited by the LS budget. Most circuits cannot meet the timing constraints after the 3D IC Voltage Assignment procedure under the limited LS budget. Next, to deal with this problem, we rescue the remaining sites by Timing Rescue. After the Timing Rescue procedure, all circuits meet the timing constraints. However, the usage of level shifters then exceeds the LS budget, because the timing rescue procedure considers only the influence of voltage assignment on timing and does not take the LS budget into consideration. Finally, we rescue the circuits again by LS Budget Rescue. Column 11 shows that three circuits still cannot be rescued successfully.
To reduce the problem size, the 3D IC Voltage Assignment method limits the voltage assignment decision to the sites, and the number of sites in a circuit should not be large. Moreover, we regard the sites as the most important gates for rescuing the timing. However, because of the LS budget constraint, considering only the sites is not enough for the timing rescue.
TABLE I. Results of the Proposed Voltage Assignment Method
B. Power Reduction
TABLE II summarizes the optimization results. TABLE I shows that three circuits fail and six circuits succeed after the proposed power reduction method is executed. Therefore, the average improvement is the average over the six successful circuits. The initial power and average temperature of each circuit are listed in columns 2-4, and the power and average temperature of the circuit and the power of the level shifters after optimization are listed in columns 5-8. The improvement percentages of the dynamic power, the leakage power, and the total power are listed in columns 9-11, respectively. The temperature decrement of each circuit is listed in column 11. The improvement columns indicate that the proposed 3D IC Voltage Assignment method provides, on average, a total power saving of almost 33.50% and a temperature decrease of 26.34 degrees. The leakage power is reduced considerably because of the temperature decrease.
TABLE III shows the leakage power and temperature estimates with the simulated temperature in columns 2-3 and without the simulated temperature in columns 4-5, and the percentage difference in leakage power in column 6. As shown in TABLE III, full-chip leakage power analysis without an accurate temperature can lead to an error of 53.88% on average. If the leakage power is underestimated, the power reduction appears to be dominated by the dynamic power, which is unrealistic.
TABLE III. Leakage Power Estimation
VI. Conclusions and Discussion
In this work, a 3D IC Voltage Assignment method that combines grid selection by the Grid-Based Procedure with Voltage Re-Assignment is proposed to minimize the total power consumption of a 3D IC design. By employing temperature-dependent gate delay and leakage power models, a more accurate estimate of circuit performance can be obtained. Although three circuits could not be rescued successfully, the experimental results show a large power reduction by the proposed method.
VII. Publications
[1] Huai-Chung Chang, Pei-Yu Huang, Ting-Jung Li, and Yu-Min Lee,
“Statistical Electro-Thermal Analysis with High Compatibility of Leakage
Power Models,” International SoC Conference (SOCC), 2010.
[2] Shu-Han Whi and Yu-Min Lee, “Dual Supply Voltage Assignment in 3D ICs
Considering Thermal Effects,” The 16th Workshop on Synthesis And
System Integration of Mixed Information Technologies (SASIMI), 2010.
[3] Yu-Min Lee and Chi-Wen Pan, “Redundant Via Insertion with Wire
Spreading Capability,” International Journal of Electrical Engineering
(IJEE), vol. 17, no. 6, pp. 383-398, December 2010.
[4] Chi-Wen Pan, Yu-Min Lee and Chih-Sheng Wang, “Redundant Via
Insertion under Timing Constraints,” International Symposium on Quality
Electronic Design (ISQED), 2011.
[5] Shu-Han Whi and Yu-Min Lee, “Supply Voltage Assignment for Power
Reduction in 3D ICs Considering Thermal Effect and Level Shifter Budget,”
International Symposium on VLSI Design, Automation and Test
(VLSI-DAT), 2011.
[6] Pei-Yu Huang and Yu-Min Lee, “Statistical Hot-Spot Identification Using
On-Chip Thermal Yield Profile,” VLSI Design/CAD Symposium
(VLSI/CAD), 2011.
[7] Pei-Yu Huang and Yu-Min Lee, “On-Chip Statistical Hot-Spot Estimation
Using Mixed-Mesh Statistical Polynomial Expression Generating and
Skew-Normal Based Moment Matching Techniques,” Accepted by Asia
South Pacific Design Automation Conference (ASPDAC), 2012.
VIII. References
[1] V. Reddy and A. T. Krishnan. Impact of negative bias temperature
instability on digital circuit reliability. Proceedings of IRPS, pages 248-254,
2002.
[2] S. H. Kulkarni and A. N. Srivastava. A new algorithm for improved VDD
assignment in low power dual VDD systems. Proceedings of ISLPED,
pages 200-205, 2004.
[3] H. Wu and I. M. Liu. Post-placement voltage island generation under
performance requirement. Proceedings of ICCAD, pages 309-316, 2005.
[4] R. L. S. Ching and E. F. Y. Young. Post-placement voltage island
generation. Proceedings of ICCAD, pages 641-646, 2006.
[5] H. Wu and M. D. F. Wong. Timing-constrained and voltage-island-aware
voltage assignment. Proceedings of DAC, pages 432, 2006.
[6] S. A. Yu, P. Y. Huang, and Y. M. Lee. A multiple supply voltage based
power reduction method in 3-D ICs considering process variations and
thermal effects. Proceedings of ASPDAC, pages 55-60, 2009.
[7] H. F. Dadgour and S. C. Lin. A statistical framework for estimation of
full-chip leakage-power distribution under parameter variations. IEEE
Transactions on Electron Devices, 54(11):2930-2945, 2007.
[8] P. Y. Huang and Y. M. Lee. Full-chip thermal analysis for the early design
stage via generalized integral transforms. IEEE Transactions on Very Large
Scale Integration Systems, 17(4):613-626, 2009.
[9] R. Hentschke and G. Flach. 3D-vias aware quadratic placement for 3D
VLSI circuits. Proceedings of ISVLSI, pages 67-72, 2007.
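Form Y04
National Science Council, Executive Yuan: Report on Attendance at an International Academic Conference by a Subsidized Domestic Expert or Scholar
October 14, 2010
Reporter: Ting-Jung Li
Affiliation and title: Department of Communication Engineering, National Chiao Tung University; master's student
Conference dates: September 27-29, 2010
Conference venue: Nevada, US
NSC approval document numbers: NSC 98-2220-E-009 -058 -; NSC 99-2220-E-009 -035 -
Conference name: 23rd IEEE International SoC Conference (SOCC 2010)
Paper title: Statistical Electro-Thermal Analysis with High Compatibility of Leakage Power Models
The report should include the following items: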
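I. Conference Participation
In addition to oral and poster paper presentations, the organizers invited experts from different fields to give forward-looking talks on the various design considerations for systems-on-chip. During the conference I attended talks in which power experts analyzed how present and future energy reduction relates to development trends in SoCs, microprocessors, and computing systems, and I discussed research topics with several paper authors and with experts and scholars from related industries.
This conference has long been one of the international conferences that research institutions watch closely, and I was honored to have the opportunity to attend the 2010 meeting held in Nevada, U.S.A. The conference was divided into oral and poster presentations. The papers presented covered a wide range of topics and advanced techniques, so attending not only deepened my knowledge of my own research field but also gave me a picture of current SoC trends.
The paper we presented at this conference was "Statistical Electro-Thermal Analysis with High Compatibility of Leakage Power Models." The full paper is attached.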
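II. Reflections on the Conference
Attending this year's International SoC Conference benefited me greatly. I learned about many of the guiding considerations and trends in SoC design, for example "green" systems aimed at saving energy and at handheld convenience. Through the oral paper sessions, I encountered algorithms from different subfields that solve similar problems, and I received directions for improvement and suggestions from other experts.
The conference mainly covered the following areas: energy optimization techniques and circuits for SoCs, analog circuits, system design methodology, communication circuits and systems, and embedded memory systems.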
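III. Suggestions
Besides participants from Europe and the United States, those from Korea, Japan, and China were all active in this international conference. Encouraging more participation in similar events in the future would greatly help international exchange and cooperation, and contact with foreign scholars would also broaden our international outlook and strengthen our research capability.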
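IV. Materials Brought Back
Conference paper and poster proceedings CD: contains all papers and posters presented at the conference.
Conference handbook: abstracts of all talks, papers, and posters, together with the conference program.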
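Attachment 3
Report on Attendance at an International Academic Conference under an NSC-Subsidized Research Project
Date: March 20, 2011
Project number: NSC99-2220-E-009-035
Project title: Development of Electronic Design Automation Technologies for 3D Integration, Subproject 1: Stochastic Electro-Thermal Simulation for 3-D ICs and Its Application to 3-D IC Power Optimization (2/2)
Traveler: Chi-Wen Pan
Affiliation and title: SOC Group, Institute of Communications Engineering, National Chiao Tung University; third-year Ph.D. student
Conference dates: March 14, 2011 to March 16, 2011
Conference venue: Santa Clara, USA
Conference name: 11th International Symposium on Quality Electronic Design
Paper title: Redundant Via Insertion under Timing Constraints
I. Conference Participation
ISQED 2011 was held in Santa Clara, USA. We were honored that our paper, "Redundant Via Insertion under Timing Constraints," was among those selected in the ISQED review. Over the three days of the conference, nearly 14 workshops took place; because time was tight, many of them ran in parallel. Attendees were therefore free to choose the topics that interested them, learning about current development trends in electronic design automation and related applications, as well as the results published by other researchers.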
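[...] some sessions ran in parallel. For sessions such as the demos and posters, however, one could first visit the demo hall to look at each topic and listen to the authors explain and demonstrate their work; some even provided actual devices that could be borrowed and taken out of the hall for testing. After a tour of the demos, one could go outside to see the posters, where almost every poster had an author beside it giving explanations. Although these two sessions ran simultaneously, the time allotted was just right, and one did not feel that everything was over too quickly. The Best Papers session was held in a very large hall, but it drew so many attendees that latecomers might not find a seat.
I presented on the last day of the conference. Four sessions ran in parallel that day, so each room had fewer people than before. There was also a small incident during the conference: the data on one speaker's slides turned into garbled characters, and the professor joked that perhaps the slides had not acclimatized to the new environment. This showed me the importance of adapting on the spot, and I admired the speaker for not letting the slide failure affect the rest of the talk. In the same session I met several students who also came from NCTU, which was a special experience.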
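II. Reflections on the Conference
This trip to ISQED 2011 was my first time attending a large international conference and my first time in the United States, and I was glad and honored to have the opportunity. During the breaks, I saw scholars and researchers discussing the talks they had just heard, which I could not help admiring, and I hope to become like them.
Because it was my first large conference, there were many things I did not know how to handle. Fortunately, my travel companion had experience and helped with many of them, and in the process I learned that doing more homework in advance makes things much easier once in the United States. The United States is vast, and without transportation in Santa Clara it is hard to get around. Luckily, there are some rail systems that could take us to the busier downtown areas. In the hotel we saw people from many countries. I found that Europeans and Americans tended to be more outgoing and would greet us first, whereas ethnic Chinese attendees usually kept to their own groups.
When attending other sessions, I paid particular attention to how others prepared their slides, studying how to use simple and clear methods so that the audience could absorb and understand the material quickly. For this reason, I kept adjusting my own slides right up to the last few days before my talk, using animations to emphasize key points in the hope of making it easier for the audience to follow. Unfortunately, some sessions ran in parallel, so I could not hear every talk and had to choose which to attend by topic.
The hotel provided wireless Internet, but in the room the signal was often weak and the connection unstable, so I could not look things up online; this also highlighted the importance of preparing in advance. This trip showed me the scale and standard of a large international conference and researchers from all over the world. I deeply felt the pressure of international competition, and we must concentrate all the more on our academic research.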
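III. Study Visits (omit if none)
IV. Suggestions
More large international conferences of this kind could be held in Taiwan, with well-known international scholars invited to speak, to raise the scale and standard of the conferences and connect with the world.
V. Materials Brought Back
11th ISQED CD x1: contains all the papers of this conference
VI. Other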
--- Forwarded Message ---
From: [email protected]
To: [email protected]
Cc: [email protected]
Sent: Sat, 26 Jun 2010 06:14:30 -0700
Subject: Your SOCC 2010 Submission (Number 68)
Dear Prof. Yu-Min Lee:
On behalf of the SOCC 2010 Program Committee, we are pleased to inform you that
the following submission has been accepted to appear at the conference as a regular
paper:
Statistical Electro-Thermal Analysis with High
Compatibility of Leakage Power Models
Please revise your paper according to the reviews. Your final manuscript will
appear in the proceedings. The manuscript is limited to SIX pages. The deadline
for submission is Friday July 9, 2010.
To upload your final manuscript, please visit the following
site:
https://www.softconf.com/b/socc2010/
and, on the left-hand side of the page, enter the passcode associated with your
submission. Your passcode is as follows:
68X-F8B3B5H7B5
Alternatively, you can click on the following URL, which will take you directly to a
form to submit your final paper:
https://www.softconf.com/b/socc2010/cgi-bin/scmd.cgi?scmd=aLogin&passcode=68X-F8B3B5H7B5
The reviews and comments are attached below. Please try to follow the reviewers'
advice when you revise your paper.
Congratulations on your fine work. If you have any additional questions, please feel
free to get in touch.
Best Regards,
--- Original Message ---
Subject: Your ISQED 2011 Submission (Number 238)
From: [email protected]
Date: Thu, November 25, 2010 7:01 am
To: [email protected]
---
Dear Mr. Chi-Wen Pan:
On behalf of the ISQED 2011 Program Committee, I am delighted to inform you that
the following submission has been accepted to appear at the conference:
Redundant Via Insertion under Timing Constraint
The Program Committee worked very hard to thoroughly review all the submitted
papers. Please repay their efforts, by following their suggestions when you revise
your paper.
To upload your final manuscript, please visit the following
site:
https://www.softconf.com/b/isqed2011/
and, on the left-hand side of the page, enter the passcode associated with your
submission. Your passcode is as follows:
238X-F5G3P6H9C5
Alternatively, you can click on the following URL, which will take you directly to a
form to submit your final paper:
https://www.softconf.com/b/isqed2011/cgi-bin/scmd.cgi?scmd=aLogin&passcode=238X-F5G3P6H9C5
The reviews and comments are attached below. Again, try to follow their advice
when you revise and improve the quality of your paper.
[...] free to get in touch.
Best Regards,
Kamesh Gadepally - ISQED2011 TPC Chair
Keith Bowman , ISQED2011 TPC Co-Chair
ISQED 2011
STATISTICAL ELECTRO-THERMAL ANALYSIS WITH HIGH COMPATIBILITY OF
LEAKAGE POWER MODELS
Huai-Chung Chang, Pei-Yu Huang, Ting-Jung Li and Yu-Min Lee
National Chiao Tung University, Hsinchu, Taiwan
Abstract: In this work, a statistical electro-thermal analyzer that is highly compatible with different power models is developed. The analyzer combines the ease of implementation of the Monte Carlo method with the fast convergence of stochastic analysis methods to solve the statistical electro-thermal problem effectively. Experimental results indicate that the developed electro-thermal analyzer can be orders of magnitude faster than the Monte Carlo method at the same accuracy level. The computation time is only 1.16 seconds for a design with over one million gates, and the maximum errors relative to the Monte Carlo method are only 0.34% and 1.84% for the mean and the standard deviation profiles of the full-chip temperature distribution, respectively.
I. INTRODUCTION
Power dissipation and thermal effects are important issues in VLSI design as technology continuously scales down and power density rapidly increases. Chip temperature profiles and gradients significantly influence IC performance, reliability, and package cost. Because leakage power contributes a large portion of the total power in modern technologies, it is necessary to model and estimate leakage power accurately. Furthermore, the leakage power of a circuit element depends exponentially on its operating temperature and process parameters. Hence, process variations and thermal impacts need to be carefully considered. In recent years, several thermal-power related analysis methods have been proposed. In power analysis, [1]-[3] quantified the effect of process variations on leakage power. Nevertheless, none of them simultaneously considers the statistical power and the electro-thermal effect. In electro-thermal analysis, [4] proposed a deterministic electro-thermal analyzer considering the temperature dependence of leakage power.
To include process variation effects, the electro-thermal simulation needs to be performed in a statistical fashion to ensure design reliability. Hence, several statistical thermal analyzers were developed [5], [6]. However, [5] did not consider thermal coupling. Although [6] presented a statistical electro-thermal analysis, it needs to re-fit the leakage power model whenever the design or its geometry changes, because its model fits the leakage power of each temperature grid rather than that of each gate. This limits its usage in early physical design stages. Moreover, both of them need specific leakage power models for the power projection [5] and the iterative log-normal approximation [6].
Because technology scaling can lead to more complicated leakage power models for enhanced accuracy, it is urgent to develop a statistical thermal analyzer that can accommodate accurate but complicated leakage power models. Compared with [5], [6], our statistical electro-thermal analyzer is more widely applicable, since we take advantage of the sparse grid collocation technique [7] to avoid convoluted statistical calculation algorithms. The sparse grid collocation technique has been adopted in thermal-power related research such as building leakage power models [3] and analyzing statistical leakage power [2]. However, neither [2] nor [3] considered or indicated how to treat temperature dependence in their power analysis methods. In this work, we present how to easily, accurately, and efficiently solve the statistical electro-thermal problem with any temperature-dependent leakage power model. Moreover, unlike [6], the developed electro-thermal analyzer does not need to re-fit leakage power models during thermal-driven early physical design stages such as floorplanning or placement, because cell-based leakage power models are adopted. First, the Karhunen-Loève (KL) expansion is used to transform spatially correlated physical parameters into a set of uncorrelated random variables. Then, the Smolyak sparse grid formulation [7] is applied to obtain the sample values of the physical parameters, which give the deterministic power models used in the deterministic electro-thermal simulations. After this set of deterministic electro-thermal simulations is solved, the Newton interpolating formula is used to calculate the expression coefficients of the temperature profile. Finally, the statistical characteristics of the temperature distribution can be extracted.
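As a much simplified illustration of the interpolation step, the sketch below applies Newton's divided-difference formula in one dimension. It is not the paper's multi-dimensional sparse grid procedure: a single hypothetical random variable is sampled at three made-up collocation points, and the sample values stand in for temperatures returned by deterministic electro-thermal simulations.

```python
def divided_differences(xs, ys):
    """Coefficients of the Newton interpolating polynomial through (xs, ys)."""
    coef = list(ys)
    n = len(xs)
    for j in range(1, n):
        # Update in place from the end so lower-order entries stay valid.
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
    return coef

def newton_eval(coef, xs, x):
    """Evaluate the Newton form at x with a Horner-like scheme."""
    result = coef[-1]
    for k in range(len(coef) - 2, -1, -1):
        result = result * (x - xs[k]) + coef[k]
    return result

# Hypothetical collocation points and simulated responses (here x**2 + 1).
points = [0.0, 1.0, 2.0]
responses = [1.0, 2.0, 5.0]
coef = divided_differences(points, responses)
value = newton_eval(coef, points, 3.0)
```

Because the three samples lie on a quadratic, the interpolant reproduces it exactly, so evaluating at 3.0 gives 10.0; statistics such as the mean and standard deviation would then be computed from the fitted expression.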
Our major contributions are as follows.
1) This work presents an easy, accurate and efficient statistical electro-thermal simulation that is highly compatible with any power model.
2) The developed statistical electro-thermal analyzer can accurately and efficiently provide the mean and standard deviation profiles of the full-chip temperature distribution.
3) Experimental results reveal that ignoring electro-thermal coupling in statistical thermal analysis can lead to significant errors in the full-chip temperature distribution.
This paper is organized as follows. Firstly, the leakage power modeling and the problem formulation are described in Section II. After that, the proposed statistical electro-thermal analyzer is detailed in Section III. Finally, experimental results and the conclusion are given in Sections IV and V, respectively.
II. LEAKAGE POWER MODELING AND PROBLEM FORMULATION
A. Leakage Power Modeling
Many leakage power models were developed in [1], [2], [5], [6], [8]. However, none of the models in [1], [2], [5] simultaneously considered temperature and process variation effects. Hence, their accuracy degrades as the technology scales down. To the best of the authors' knowledge, only [6], [8] considered both effects simultaneously. Nevertheless, the leakage current model in [8] was based on 90nm technology. Hence, as the technology advances, its accuracy deteriorates, as shown in TABLE I. A grid-based leakage power model was developed in [6]. Each fitted model was used to coarsely approximate the total leakage current in each grid, and this limits its usage after the floorplanning stage.
139 978-1-4244-6683-2/10/$26.00 ©2010 IEEE
TABLE I
ERROR COMPARISON OF Is AND Ig WITH THE RESULTS OF HSPICE UNDER 65nm TECHNOLOGY FOR A NAND GATE.

| | $f_g$ | Max Error | Avg. Error | Error > 3% |
|---|---|---|---|---|
| Without temperature | $T_{ox}$, $T_{ox}^2$, $L_{ch}$, $L_{ch}^2$ [1] | 6.48% | 2.70% | 4.37% |
| With temperature | Our adopted model: a polynomial function constructed by $L_{ch}$, $T$, $T_{ox}$ and $T_{ox}^2$ | 1.55% | 0.29% | 0.00% |

| | $f_s$ | Max Error | Avg. Error | Error > 3% |
|---|---|---|---|---|
| Without temperature | $L_{ch}$, $L_{ch}^2$, $T_{ox}^{-1}$, $T_{ox}^2$ [1] | 347.32% | 70.65% | 98.27% |
| Without temperature | $L_{ch}$, $L_{ch}^2$, $T_{ox}^{-1}$, $T_{ox}$, $T_{ox}^2$, $T_{ox}/L_{ch}$, $L_{ch}/T_{ox}$, $T_{ox}L_{ch}$ [2] | 314.13% | 70.52% | 100.00% |
| With temperature | $L_{ch}$, $T$, $T_{ox}$ [8] | 32.23% | 8.73% | 76.62% |
| With temperature | Our adopted model: a 3rd order polynomial function completely expanded by $L_{ch}$, $T_{ox}$ and $T$ | 1.31% | 0.19% | 0.00% |

Here, a cell-based leakage power fitting model including the
process variation effect and temperature dependence is presented. Firstly, for each cell, different input patterns, various physical process parameter values and operating temperatures are combined and fed into HSPICE, using an industrial design kit under the BSIM4 model, to generate its leakage current data. After that, the leakage currents averaged over the input patterns are fitted by the least-squares method. Finally, the fitted coefficients of the average leakage current models, such as the average subthreshold leakage (Is) and the average gate tunneling leakage (Ig), are obtained.
Since $I_s$ is the off-state leakage, and $I_g$ occurs in both the on and off states of a transistor, the cell leakage power can be written as
$$P_{leakage} = V_{dd} \times \left( I_g + (1 - S_w) I_s \right), \tag{1}$$
where
$$I_g = a_0 \cdot e^{f_g(T_{ox}, L_{ch}, T)}, \tag{2}$$
$$I_s = b_0 \cdot e^{f_s(T_{ox}, L_{ch}, T)}. \tag{3}$$
Here, $a_0$ and $b_0$ are fitted constants, and $L_{ch}$ and $T_{ox}$ are the channel length and oxide thickness, respectively. $T$ is the operating temperature, $S_w$ is the switching activity, $V_{dd}$ is the supply voltage, and $f_g(T_{ox}, L_{ch}, T)$ and $f_s(T_{ox}, L_{ch}, T)$ are specific fitting forms.¹ $I_g$ is modeled as exponentially dependent on temperature since it is exponentially affected by the threshold voltage [9].
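As an illustration of how models of the form (2) and (3) can be fitted and then used in (1), the following Python sketch performs a least-squares fit of the logarithm of the current. It is only a sketch under stated assumptions: the second-order basis, the normalized inputs, the synthetic "measurements" and all function names are hypothetical stand-ins, not the third-order model or the HSPICE data used in this work.

```python
import numpy as np

def basis(tox, lch, temp):
    # Hypothetical second-order polynomial basis standing in for f_g or f_s.
    # Inputs are normalized deviations from nominal so columns have similar scale.
    return np.column_stack([np.ones_like(tox), tox, lch, temp,
                            tox**2, lch**2, temp**2,
                            tox * lch, tox * temp, lch * temp])

def fit_log_current(tox, lch, temp, current):
    # ln(I) = ln(a0) + f(Tox, Lch, T) is linear in the unknown coefficients,
    # so ordinary least squares applies after taking the logarithm.
    coeffs, *_ = np.linalg.lstsq(basis(tox, lch, temp), np.log(current), rcond=None)
    return coeffs

def eval_current(coeffs, tox, lch, temp):
    # Evaluate the fitted exponential-of-polynomial current model.
    return np.exp(basis(tox, lch, temp) @ coeffs)

def cell_leakage_power(vdd, sw, ig, i_s):
    # Equation (1): Is flows only in the off state, Ig flows in both states.
    return vdd * (ig + (1.0 - sw) * i_s)

# Synthetic stand-in for a circuit-simulation sweep (made-up coefficients, not real data).
rng = np.random.default_rng(0)
tox, lch, temp = rng.uniform(-1.0, 1.0, size=(3, 200))
true_coeffs = np.array([-18.0, -0.8, -0.5, 0.6, 0.1, 0.05, 0.02, 0.03, -0.04, 0.01])
i_s_samples = np.exp(basis(tox, lch, temp) @ true_coeffs)

fitted = fit_log_current(tox, lch, temp, i_s_samples)
max_rel_err = np.max(np.abs(eval_current(fitted, tox, lch, temp) / i_s_samples - 1.0))
```

Fitting in the logarithmic domain keeps the problem linear, which is one reason the exponential form is convenient; a richer basis can be swapped into `basis` without changing the rest of the sketch.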
The accuracy of several existing leakage current models has been investigated [1], [2], [8]. Because they are not sufficiently accurate, considerably more accurate leakage current models are proposed in this work. The error comparison between the existing leakage current models and the proposed leakage current models for a two-input NAND gate is shown in TABLE I.
As shown in TABLE I, different $f_g$ and $f_s$ lead to different errors compared with the results of HSPICE. The drastic errors of [1], [2], [8] arise because they neglect either the temperature dependence or the advance of technology. The maximum error and average error of the proposed models are no more than 1.55% and 0.29%, respectively. In fact, for all cell types given by an industrial design kit, the maximum error and average error of our leakage current models are only 1.55% and 0.5%, respectively.
The above comparison shows that the leakage current (power) model may need to be very complicated to achieve acceptable accuracy. This indicates that a statistical power analyzer or statistical thermal analyzer should be able to handle complicated leakage current (power) models.
¹ We consider the variations of the device channel length and the oxide thickness since the leakage power is very sensitive to them [1]. Although only these two parameters are considered, our framework can easily be extended to include other process variation parameters, such as channel dopant variation.
Fig. 1. Compact thermal model of physical design.
B. Problem Formulation
The compact thermal model of a chip for the physical design stage is shown in Fig. 1 [10]. The primary heat flow path is composed of the thermal interface material, heat spreader and heat sink. The secondary heat flow path involves interconnect layers, I/O pads, and the printed circuit board. The functional blocks are modeled as power sources attached to the thin layer close to the top surface of the die. The main heat sources consist of the dynamic and leakage power consumed by devices. Because the dynamic power is insensitive to process variations and operating temperature [1], it is treated as deterministic. However, the leakage power depends strongly on process parameters and operating temperature. Hence, the leakage power is treated as a random process [1], and the thermal coupling effect needs to be considered in the full-chip temperature distribution analysis. By combining the compact thermal model and the statistical power consumption considering the thermal coupling effect, the steady-state temperature distribution $T(r, \theta, \epsilon)$ of the die is determined by the following statistical steady-state heat transfer equation
$$\nabla \cdot \left( \kappa(r, T) \nabla T(r, \theta, \epsilon) \right) = -p(r, L_{ch}(x, y, \theta), T_{ox}(x, y, \epsilon), T), \tag{4}$$
subject to the following boundary condition
$$\kappa(r_{bs}, T) \frac{\partial T(r_{bs}, \theta, \epsilon)}{\partial n_{bs}} + h_{bs} T(r_{bs}, \theta, \epsilon) = f_{bs}(r_{bs}). \tag{5}$$
Here, $\nabla \cdot$ is the divergence operator, and $\kappa(r, T)$ is the thermal conductivity of the die. The $p(r, L_{ch}(x, y, \theta), T_{ox}(x, y, \epsilon), T)$ is the random process of the power density profile, which consists of the dynamic power density profile $p_d(r)$, the sub-threshold leakage power density profile $p_s(r, L_{ch}(x, y, \theta), T_{ox}(x, y, \epsilon), T)$, and the gate leakage power density profile $p_g(r, L_{ch}(x, y, \theta), T_{ox}(x, y, \epsilon), T)$. The $r = (x, y, z) \in D$, where $D = (0, L_x) \times (0, L_y) \times (-L_z, 0)$ is the domain of the die, $L_x$ and $L_y$ are the lateral sizes of the die, and $L_z$ is the thickness of the die. The $\theta$ and $\epsilon$ are sampling values of manufacturing outcomes $\Omega_{L_{ch}}$ and $\Omega_{T_{ox}}$ for the channel length and oxide thickness, respectively. The $L_{ch}(x, y, \theta)$ and $T_{ox}(x, y, \epsilon)$ denote the channel length and the oxide thickness, respectively. The $bs$ is any specific boundary surface of the die, and $r_{bs}$ is a position located on $bs$. The $h_{bs}$ is the heat-transfer coefficient on $bs$, $f_{bs}(r_{bs})$ is the heat flux on $bs$, and $\partial/\partial n_{bs}$ is the differential operator along the outward direction normal to $bs$. Since the major part of the device current passes through the region close to the channel, the power density profile is nonzero only in that region, whose thickness equals the junction depth for dynamic and sub-threshold leakage power and the Debye length for gate tunneling leakage power.

Fig. 2. The flowchart of proposed statistical electro-thermal analyzer.
With equations (4)–(5), our goal is to evaluate the mean and variance profiles of the steady-state full-chip temperature distribution.
III. PROPOSED STATISTICAL ELECTRO-THERMAL ANALYZER
The flowchart of the proposed statistical electro-thermal analyzer is shown in Fig. 2. Each operation in Phase 1 depends only on the technology node rather than the design pattern, whereas the operations of Phase 2 are design dependent.
In Phase 1, given a spatial covariance function of physical parameters, the KL expansion is employed to decompose correlated parameters into a set of uncorrelated random variables. After that, the Smolyak sparse grid formula is used to generate the sparse grid, which is a set of sampled random vectors of the KL-expanded and inter-die random variables. In Phase 2, for each sampled random vector on the sparse grid, a deterministic electro-thermal simulation is performed with the deterministic power profile obtained from this sampled random vector. Then, with all thermal profiles corresponding to the sampled random vectors, an approximate representation of the stochastic full-chip temperature distribution is obtained by the Newton interpolating formula. Finally, the statistical characteristics of the full-chip temperature distribution, such as its mean and variance profiles, are extracted.
Different from the existing statistical thermal/electro-thermal analyzers [5], [6], the proposed framework can easily, accurately and efficiently obtain an approximate expression of the full-chip temperature distribution without resorting to complicated statistical calculation algorithms such as the power projection [5] and the iterative statistical temperature moment extraction [6]. This is because the power profile corresponding to each sampled random vector is deterministic while the associated deterministic electro-thermal analysis is performed. Hence, accurate but complicated leakage power models can be adopted in this framework. Each step in Fig. 2 is detailed in the following subsections.
A. Parameter Transformation
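Generally, process variations of a physical parameter $P$ can be classified into intra-die variations $P^{intra}$ and inter-die variations $P^{inter}$, both of which can be modeled as Gaussian random variables [1]. The physical parameter $P \in \{T_{ox}, L_{ch}\}$ with its expected value $\bar{P}$ at position $r_{xy} = (x, y)$ can be written as
$$T_{ox}(r_{xy}, \epsilon) = \bar{T}_{ox}(r_{xy}) + \Delta T_{ox}^{intra}(r_{xy}, \epsilon_i) + \Delta T_{ox}^{inter}(r_{xy}, \epsilon_j), \tag{6}$$
$$L_{ch}(r_{xy}, \theta) = \bar{L}_{ch}(r_{xy}) + \Delta L_{ch}^{intra}(r_{xy}, \theta_i) + \Delta L_{ch}^{inter}(r_{xy}, \theta_j). \tag{7}$$
The $\epsilon_i$ and $\epsilon_j$ are subsets of $\epsilon$, and $\theta_i$ and $\theta_j$ are subsets of $\theta$.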
According to [1], $T_{ox}(r_{xy}, \epsilon) = T_{ox}(x, y, \epsilon)$ is assumed to be spatially uncorrelated.² Because the spatial correlation of $\Delta L_{ch}^{intra}(r_{xy}, \theta_i)$ might have different decreasing rates in the x- and y-directions, the spatial covariance function proposed in [11] is adopted for $\Delta L_{ch}^{intra}(r_{xy}, \theta_i)$.³ Given $\sigma$ as the standard deviation of the target random process, and correlation lengths $\eta_x$ and $\eta_y$ in the x- and y-directions, respectively, the spatial covariance function between two random variables at points $r_{x_1 y_1}$ and $r_{x_2 y_2}$ is
$$C(r_{x_1 y_1}, r_{x_2 y_2}) = \sigma^2 e^{-\frac{|x_1 - x_2|}{\eta_x}} e^{-\frac{|y_1 - y_2|}{\eta_y}}. \tag{8}$$
Applying the KL expansion, $\Delta L_{ch}^{intra}(r_{xy}, \theta_i)$ based on (8) can be approximated as
$$\Delta L_{ch}^{intra}(r_{xy}, \theta_i) \approx \sum_{m=1}^{N_{L_{ch}}} \sqrt{\chi_m}\, q_m(r_{xy})\, \zeta_m(\theta_i). \tag{9}$$
Here, the $\chi_m$ are the eigenvalues of $C(r_{x_1 y_1}, r_{x_2 y_2})$, the $q_m$ are the corresponding eigenfunctions, and $N_{L_{ch}}$ is the expansion length. $\{\zeta_m(\theta_i)\}$ is the set of uncorrelated standard normal random variables.
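The truncated expansion (9) can be sketched numerically as follows. This is an assumption-laden illustration rather than the procedure of this work: it discretizes the covariance (8) on a small hypothetical grid and uses a numerical eigen-decomposition in place of the closed-form eigen-pairs; the grid size, correlation lengths and function names are all made up for the example.

```python
import numpy as np

def covariance_matrix(points, sigma, eta_x, eta_y):
    # Equation (8) evaluated between every pair of grid points.
    dx = np.abs(points[:, None, 0] - points[None, :, 0])
    dy = np.abs(points[:, None, 1] - points[None, :, 1])
    return sigma**2 * np.exp(-dx / eta_x) * np.exp(-dy / eta_y)

def kl_modes(cov, n_terms):
    # Eigen-decomposition of the symmetric covariance; keep the largest modes.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_terms]
    return eigvals[order], eigvecs[:, order]

def sample_intra_die(eigvals, eigvecs, zeta):
    # Discrete counterpart of (9): sum_m sqrt(chi_m) * q_m(r) * zeta_m.
    return eigvecs @ (np.sqrt(eigvals) * zeta)

# Hypothetical 8 x 8 grid on a unit-size die.
xs, ys = np.meshgrid(np.linspace(0.0, 1.0, 8), np.linspace(0.0, 1.0, 8))
points = np.column_stack([xs.ravel(), ys.ravel()])
cov = covariance_matrix(points, sigma=1.0, eta_x=0.5, eta_y=0.3)

# Choose the expansion length that captures at least 95% of the total variance.
all_vals, all_vecs = kl_modes(cov, len(points))
energy = np.cumsum(all_vals) / np.sum(all_vals)
n_terms = int(np.searchsorted(energy, 0.95) + 1)

chi, q = kl_modes(cov, n_terms)
zeta = np.random.default_rng(1).standard_normal(n_terms)
delta_l = sample_intra_die(chi, q, zeta)   # one realization of the intra-die variation
```

The 95% variance threshold is an arbitrary choice for the sketch; in practice the expansion length trades accuracy against the number of random variables passed to the sparse grid.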
Owing to the properties of the KL expansion, the expanded random variables are Gaussian if the target random process is Gaussian, and a closed form of the eigen-pair $(\chi_m, q_m(r_{xy}))$ can be derived [12]. In the rest of this paper, $\zeta = \{\zeta_m\}$ and $\varsigma = \{\varsigma_n\}$ are the sets of random variables representing $L_{ch}$ and $T_{ox}$, respectively, $\tilde{\xi} = \zeta \cup \varsigma$, and $\theta$ and $\epsilon$ are dropped for notational simplicity.
B. Smolyak Sparse Grid Formulation
The basic idea of the Smolyak sparse grid formulation is to build an interpolating approximation of a high-dimensional multivariate function $u \in C^r$ from far fewer sampling values of the desired function than the full tensor product interpolation formula requires, while keeping an acceptable error bound on the order of $O(M^{-r} (\log M)^{(d-1)(r-1)})$ [13]. Here, $M$ is the number of sampling points, and $d$ is the number of variables.
For the Monte Carlo method, the random variable samples are randomly generated, and a large number of samples is required to achieve accurate mean and variance estimation. For the Smolyak sparse grid formulation, the random variable samples are generated by using roots of Hermite polynomial chaos (H-PCs) or extrema of the Chebyshev polynomial [14], and the desired solution is obtained by interpolation with these samples. A high-order interpolating approximation can thus be achieved with a small number of samples [13].

² Although $T_{ox}$ is assumed to be spatially uncorrelated, the proposed simulation mechanism still works when $T_{ox}$ is spatially correlated.
³ Although we choose the specific spatial covariance function (8), any valid spatial covariance function can be adopted.
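To give a feel for why a handful of Hermite-root samples can compete with many random samples, the sketch below estimates the mean of a smooth function of one standard normal variable in both ways. It is a hedged illustration only: it uses Gauss-Hermite quadrature rather than the interpolation scheme of this work, and the response function and helper names are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def gauss_hermite_mean(func, n_nodes):
    # Nodes are the roots of the degree-n probabilists' Hermite polynomial.
    # The weights integrate against exp(-x^2 / 2), hence the sqrt(2*pi) factor.
    nodes, weights = hermegauss(n_nodes)
    return np.sum(weights * func(nodes)) / np.sqrt(2.0 * np.pi)

def monte_carlo_mean(func, n_samples, seed=0):
    # Plain Monte Carlo estimate with randomly generated samples.
    rng = np.random.default_rng(seed)
    return np.mean(func(rng.standard_normal(n_samples)))

def response(xi):
    # Hypothetical smooth response of a single normalized process parameter.
    return np.exp(0.3 * xi)

exact = np.exp(0.3**2 / 2.0)          # E[exp(a*xi)] = exp(a^2 / 2) for standard normal xi
gh_error = abs(gauss_hermite_mean(response, 5) - exact)
mc_error = abs(monte_carlo_mean(response, 1000) - exact)
```

Comparing `gh_error` (five deterministic samples) with `mc_error` (one thousand random samples) shows the kind of sample efficiency that motivates structured sampling for smooth responses.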
According to the Smolyak sparse grid formulation [7], the desired full-chip statistical temperature distribution $T(r, \tilde{\xi})$, represented by a set of KL-expanded random variables $\tilde{\xi}$, can be explicitly approximated as [15]
$$\tilde{T}_{q}^{d}(r, \tilde{\xi}) = \sum_{q-d+1 \le |i| \le q} (-1)^{q-|i|} \binom{d-1}{q-|i|} \left( Q^{i_1}(T) \otimes \cdots \otimes Q^{i_d}(T) \right), \tag{10}$$
where $d$ is the number of random variables in $\tilde{\xi}$, $q$ is the level of the desired solution, $Q^{i_n}$ with level $i_n \ge 1$ is the one-dimensional interpolating operator of $T(r, \tilde{\xi})$ with respect to the $n$-th random variable in $\tilde{\xi}$, $\otimes$ is the functional cross product, and $|i| = i_1 + \cdots + i_n + \cdots + i_d$. The level $i_n$ is the index that decides the number of sampling values for the interpolating polynomial $Q^{i_n}$. As suggested in [16], the relation between the number of sampling values $m_{i_n}$ and the level $i_n$ is $m_1 = 1$ and $m_{i_n} = 2^{i_n - 1} + 1$ for $i_n > 1$.
From (10), the temperature needs to be known only on the following small set of sampling values of $\tilde{\xi}$ [17]. The sparse grid in (10), i.e., the set of sampling values of $\tilde{\xi}$, is given by
$$H(q, d) = \bigcup_{q-d+1 \le |i| \le q} \left( \vartheta^{i_1} \times \cdots \times \vartheta^{i_n} \times \cdots \times \vartheta^{i_d} \right), \tag{11}$$
where $\vartheta^{i_n}$ denotes the set of sampling points of $\tilde{\xi}_n$, and '$\times$' is the Cartesian product of the point sets.
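The index set and the grid in (10) and (11) can be enumerated with a few lines of Python. The sketch below is illustrative only and makes one notable substitution: it uses Chebyshev extrema as one-dimensional nodes because they are nested for every level, whereas the experiments in this work adopt roots of H-PCs. All function names are hypothetical.

```python
import itertools
import math

def num_points(level):
    # m_1 = 1 and m_i = 2^(i-1) + 1 for i > 1.
    return 1 if level == 1 else 2 ** (level - 1) + 1

def chebyshev_extrema(level):
    m = num_points(level)
    if m == 1:
        return [0.0]
    # Rounding lets nodes shared by nested levels compare equal; "+ 0.0" removes -0.0.
    return [round(-math.cos(math.pi * j / (m - 1)), 12) + 0.0 for j in range(m)]

def multi_indices(q, d):
    # All i = (i_1, ..., i_d) with i_n >= 1 and q - d + 1 <= |i| <= q.
    for idx in itertools.product(range(1, q - d + 2), repeat=d):
        if q - d + 1 <= sum(idx) <= q:
            yield idx

def combination_coefficient(q, d, idx):
    # Weight of each tensor-product term in equation (10).
    k = q - sum(idx)
    return (-1) ** k * math.comb(d - 1, k)

def sparse_grid(q, d):
    # Equation (11): union of Cartesian products of the one-dimensional node sets.
    grid = set()
    for idx in multi_indices(q, d):
        grid.update(itertools.product(*(chebyshev_extrema(i) for i in idx)))
    return sorted(grid)

grid_2d = sparse_grid(q=4, d=2)                 # two variables, level q = d + 2
full_tensor_size = num_points(3) ** 2           # full tensor grid at the finest 1-D level
```

For two variables at level q = d + 2, the sparse grid keeps 13 points, against 25 for the full tensor grid built from the finest one-dimensional rule, and the gap widens quickly as d grows.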
The number of sampling points of the Smolyak sparse grid formulation grows as $O\left(\frac{d^{q-d}}{(q-d)!}\right)$, which is far less severe than that of the full tensor product formulation. The runtime complexity of the proposed statistical electro-thermal analyzer is therefore $O\left(C_{det} \frac{d^{q-d}}{(q-d)!}\right)$, where $C_{det}$ is the runtime complexity of executing one deterministic electro-thermal simulation.
The sampling values corresponding to $\vartheta^{i_n}$ must be properly decided. Adopting the roots of H-PCs whose order corresponds to the level $i_n$ achieves the most accurate result if $\tilde{\xi}$ is a set of normal random variables [14]. On the other hand, adopting the extrema of the Chebyshev polynomial whose order corresponds to the level $i_n$ yields a nested sparse grid structure for any level with acceptable accuracy [16]. In this paper, we adopt the roots of H-PCs in our experimental implementation because the results are shown to be very accurate with a low-level approximation, and the nested sparse grid structure is still preserved for $q = d + 1$.⁴
C. Calculation of Temperature Profiles on Sparse Grids
After the sparse grid $H(q, d)$ is obtained, the samples of channel length and oxide thickness corresponding to the $m$-th sampling point $\tilde{\xi}_m$ of $H(q, d)$ can be obtained by the parameter modeling technique stated in Section III-A. Hence, the deterministic power density profile corresponding to $\tilde{\xi}_m$ can be obtained. With this deterministic power density profile, we have the following deterministic steady-state heat transfer equation
$$\nabla \cdot \left( \kappa(r, T) \nabla T(r, \tilde{\xi}_m) \right) = -p(r, \tilde{\xi}_m, T), \tag{12}$$
⁴ If a high-order approximation is required for accuracy, we suggest the extrema of the Chebyshev polynomial because its nested sparse grid structure is preserved for any level; hence, it needs far fewer sampling points than the roots of the H-PCs do for a high-order approximation.
Algorithm: Calculation of Temperature Profiles on Sparse Grid
Input: Sampling point ξ̃_i, initial temperature T_ini and p_dyn(r)
Output: Stable temperature profile T(r, ξ̃_i) of ξ̃_i
Begin
  Obtain T_ox(r_xy, ξ̃_i) and L_ch(r_xy, ξ̃_i) according to ξ̃_i;
  T(r, ξ̃_i) ← T_ini;
  T′(r, ξ̃_i) ← 0;
  While (‖T(r, ξ̃_i) − T′(r, ξ̃_i)‖ > Converging criterion)
    T′(r, ξ̃_i) ← T(r, ξ̃_i);
    Update p_leakage(r, ξ̃_i, T) by T(r, ξ̃_i);
    p_total(r, ξ̃_i, T) ← p_leakage(r, ξ̃_i, T) + p_dyn(r);
    † Solve deterministic thermal equations (12) and (13) with p_total(r, ξ̃_i, T) to obtain a new T(r, ξ̃_i);
    if (T(r, ξ̃_i) = Infinite) then Thermal runaway;
  Return T(r, ξ̃_i)
End
† The deterministic thermal analyzer [18] is used to obtain the new T(r, ξ̃_i). Any deterministic thermal analyzer can be used here.
Fig. 3. Deterministic electro-thermal analysis for each sampling point in the sparse grid. p_leakage, p_dyn and p_total are the leakage, dynamic and total power density profiles for each sampling point of the sparse grid, respectively. T′ holds the temperature profile of the previous iteration.
subject to the following boundary condition
$$\kappa(r_{bs}, T) \frac{\partial T(r_{bs}, \tilde{\xi}_m)}{\partial n_{bs}} + h_{bs} T(r_{bs}, \tilde{\xi}_m) = f_{bs}(r_{bs}). \tag{13}$$
Here, $p(r, \tilde{\xi}_m, T)$ and $T(r, \tilde{\xi}_m)$ are the deterministic power density and temperature profiles with respect to $\tilde{\xi}_m$, respectively. Since the power density profile is temperature dependent in equation (12), the deterministic electro-thermal analysis summarized in Fig. 3 is used to obtain each $T(r, \tilde{\xi}_m)$.
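The iteration of Fig. 3 can be sketched as a fixed-point loop. The example below is a minimal illustration under strong assumptions: the thermal solve is replaced by a hypothetical linear thermal-resistance matrix instead of a solver for (12) and (13), the leakage law is a made-up exponential, and every name and number is chosen only for the demonstration.

```python
import numpy as np

def electro_thermal_solve(p_dyn, leakage_fn, thermal_solve, t_init,
                          tol=1e-6, max_iter=200):
    # Fixed-point loop: update leakage from the latest temperature, re-solve the
    # thermal problem, and stop once the temperature profile stops changing.
    temp = np.asarray(t_init, dtype=float)
    for _ in range(max_iter):
        prev = temp
        with np.errstate(over="ignore"):        # let overflow show up as inf
            p_total = leakage_fn(prev) + p_dyn  # leakage at the latest temperature
        temp = thermal_solve(p_total)           # stand-in for solving (12) and (13)
        if not np.all(np.isfinite(temp)):
            raise RuntimeError("thermal runaway")
        if np.max(np.abs(temp - prev)) <= tol:
            return temp
    raise RuntimeError("no convergence within max_iter")

# Toy four-block chip (all values hypothetical).
t_amb = 25.0                                          # ambient temperature
r_th = 2.0 * np.eye(4) + 0.5 * np.ones((4, 4))        # thermal resistance matrix
p_dyn = np.array([1.0, 0.5, 0.8, 0.3])                # dynamic power per block

def leakage(temp):
    # Made-up exponential temperature dependence of leakage power.
    return 0.05 * np.exp(0.02 * (temp - t_amb))

def thermal(power):
    # Linear steady-state model: temperature rise = resistance matrix times power.
    return t_amb + r_th @ power

t_steady = electro_thermal_solve(p_dyn, leakage, thermal, np.full(4, t_amb))
```

Because the solver is passed in as a function, any deterministic thermal analyzer could take the place of `thermal`, which mirrors the remark attached to Fig. 3.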
D. Polynomial Interpolation of Temperature Distribution
Instead of directly using equation (10), which requires obtaining a different $Q^{i_1}(T) \otimes \cdots \otimes Q^{i_d}(T)$ for each different $|i| = i_1 + \cdots + i_d$, we take advantage of the nested sparse grid structure and perform the Newton interpolating method [14] once to globally interpolate $T(r, \tilde{\xi})$ from the deterministic temperature profiles of all sampling values in the sparse grid. For a sparse grid that does not preserve the nested structure, the Newton interpolating method can be applied to obtain each different $Q^{i_1}(T) \otimes \cdots \otimes Q^{i_d}(T)$.
Based on the Newton interpolating formula, the temperature at a specified die position $r^*$ can be approximated as
$$T(r^*, \tilde{\xi}) \approx \sum_{m=0}^{N} \hat{a}_m(r^*) \phi_m(\tilde{\xi}), \tag{14}$$
where $\phi_m(\tilde{\xi})$ is an interpolating polynomial with respect to the $m$-th sampling value $\tilde{\xi}_m$, and its form can be found in [14]. Here, $N = |H(q, d)| - 1$, $|H(q, d)|$ is the number of sampling values in the sparse grid, and the coefficients $\hat{a}_m(r^*)$ need to be determined.
Based on the basic idea of interpolation, namely that the approximating function must match each known data point, the interpolating polynomial in (14) must satisfy equation (15) for each $\tilde{\xi}_k$.
$$\sum_{m=0}^{N} \hat{a}_m(r^*) \phi_m(\tilde{\xi}_k) = T(r^*, \tilde{\xi}_k). \tag{15}$$
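For a single random variable, conditions of the form (15) reduce to a small lower-triangular linear system, as the following sketch shows. It is a one-dimensional illustration only: the multivariate basis of [14] is not reproduced, the temperature response is a hypothetical quadratic, and extracting the moments by Gauss-Hermite quadrature of the surrogate is just one possible choice.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def newton_basis(nodes, xi):
    # phi_0 = 1 and phi_m(xi) = prod_{j<m} (xi - nodes[j]); shape (len(xi), N + 1).
    xi = np.atleast_1d(np.asarray(xi, dtype=float))
    phi = np.ones((xi.size, len(nodes)))
    for m in range(1, len(nodes)):
        phi[:, m] = phi[:, m - 1] * (xi - nodes[m - 1])
    return phi

def newton_coefficients(nodes, values):
    # Interpolation conditions as a linear system: Phi[k, m] = phi_m(xi_k),
    # which is lower triangular for the Newton basis.
    return np.linalg.solve(newton_basis(nodes, nodes), values)

def surrogate(nodes, coeffs, xi):
    # One-dimensional counterpart of (14): T(xi) ~ sum_m a_m * phi_m(xi).
    return newton_basis(nodes, xi) @ coeffs

def temperature_at(xi):
    # Hypothetical temperature response at one die position (degrees Celsius).
    return 60.0 + 5.0 * xi + 0.8 * xi**2

nodes = hermegauss(5)[0]                       # sampling values: roots of He_5
coeffs = newton_coefficients(nodes, temperature_at(nodes))

# Moments of the surrogate under a standard normal variable, by quadrature.
q_nodes, q_weights = hermegauss(10)
t_q = surrogate(nodes, coeffs, q_nodes)
mean_t = np.sum(q_weights * t_q) / np.sqrt(2.0 * np.pi)
std_t = np.sqrt(np.sum(q_weights * (t_q - mean_t) ** 2) / np.sqrt(2.0 * np.pi))
```

Once the coefficients are known, the surrogate is cheap to evaluate, so its statistics can be obtained without further deterministic simulations.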