集合體基礎的代數多重網格方法在晶片上功率網路分析上的應用

(1)

國

立

交

通

大

學

電信工程學系

碩士論文

集合體基礎的代數多重網格方法在晶片

上功率網路分析上的應用

An Aggregation-Based Algebraic Multigrid Method

for On-Chip Power Network Analysis

研究生：周桓宇

指導教授：李育民教授

(2)

集合體基礎的代數多重網格方法在晶片上功率網路分析上的應用

An Aggregation-Based Algebraic Multigrid Method

for On-Chip Power Network Analysis

研究生：周桓宇 Student：Huan-Yu Chou

指導教授：李育民 Advisor：Yu-Min Lee

國立交通大學

電信工程系

碩士論文

A Thesis

Submitted to Department of Communication Engineering College of Electrical Engineering and Computer Science

National Chiao Tung University in partial Fulfillment of the Requirements

for the Degree of Master

in

Communication Engineering

July 2006

Hsinchu, Taiwan, Republic of China

(3)

集合體基礎的代數多重網格方法在晶片上功率網路分析上的應用

學生：

周桓宇

指導教授

：李育民

國立交通大學電信工程學系碩士班

摘

要

隨著次深次微米技術演進到 0.18 微米以下，晶片上功率傳輸網路的分析，已

經變成在今日的高效能晶片設計下的ㄧ個非常重要而且具挑戰性的ㄧ個問題。功

率傳輸網路上較低的電源電壓，將會減少電路的雜訊容忍度。除此之外，較高的

電路操作頻率，將使得由 Ldi/dt 電壓壓降而來的電路雜訊為之增加。這些效應將

會增加功率傳輸網路的設計複雜度以及對於有效率的功率傳輸網路分析方法的

需求。

在本篇論文當中，對於功率傳輸網路分析，我們提出了ㄧ個以集合體基礎的

代數多重網格分析方法。首先，我們將原始的功率傳輸網路模擬成許多 RLKC 元

件以及片段線性的電流源。然後，利用修飾節點分析方法，我們可以把原始問題

轉換成一個 Ax=b 的線性代數問題。在此，A 是一個

n×n

的矩陣，x 和 b 是

n×1

的

向量。在對於這個線性代數問題，應用了我們的集合體演算法之後，我們可以把

原始的系統矩陣分成許多小的子矩陣，並實行一個代數的切割去簡化問題。

實驗結果顯示出，我們的集合體基礎的代數多重網格方法，在時間和記憶體

方面，比較傳統的代數多重網格方法以及現存的改善 Krylov 子系統方法，都得

到了更好的結果。

i

(4)

An Aggregation-Based Algebraic Multigrid Method

for On-Chip Power Network Analysis

Student：Huan-Yu Chou

Advisor：Dr.

Yu-Min Lee

Department of Communication Engineering

National Chiao Tung University

ABSTRACT

As the ultra deep sub-micron technology scales down to 0.18 µm, power

distribution network analysis becomes one of the most critical and challenging

problems in today’s high performance chip design. Lower supply voltage on power

distribution network decreases the circuit noise margin and higher circuit operation

frequency increases the circuit noise from Ldi/dt voltage drop. Those effects increase

the design complexity of power distribution network and also increase the demand of

efficient power distribution network analysis methods.

In this thesis, we present an aggregation-based algebraic multigrid method for

power distribution network analysis. First, we model the original power distribution

network with RLKC segments and piecewise linear current sources. Then we use

modified nodal analysis to transform the problem into an Ax=b linear algebraic

problem where A is a

n×n

matrix, x and b are

n×1

vectors. By performing an

aggregation algorithm, the original system matrix is divided into many small

sub-matrices and an algebraic partition is performed to simplify our problem.

Experimental results show our aggregation-based algebraic multigrid method runs

faster and spend less memory usage than both traditional algebraic multigrid method

and the existing IEKS (Improve Krylov Subspace) method.

(5)

誌

謝

在這篇論文能順利完成的當下，我在此首先最要感謝的就是我的

指導教授李育民博士的指導，老師的專業背景和認真的研究態度，

總是能適時的指出學生在研究時所無法發現的盲點，從而做出正確的

修正。在碩士生捱所學到的研究態度不只是得到這個學位，相信在之

後的人生中，也是學到了應該正確面對事情的態度。

另外，我在此還要感謝實驗室的學弟培育。因為研究領域的相

似，在相互討論的過程當中，令我得到了很多非常寶貴的意見，也觸

發了很多的想法。

然後，還要感謝同級的震軒、義琅，以及博班學長至鴻在研究

以及修課方面的協助和陪伴，讓我的碩士生涯過的更加愉快。

最後我要深深感激的，就是在碩士生涯中支持我的家人，因為有

了你們的支持，我才有可能拿到這個學位。

(6)

List of Figures

1.1 An Example of IR Drop Voltage Noise . . . 2

1.2 The Structure of Power Network. . . 4

2.1 RLC Circuit . . . 9

2.2 RLC Circuit After Step 1,2 . . . 9

2.3 Decompositions of Matrix A . . . 14

2.4 Smoothing of Random Error by Gauss-Seidel Iteration . . . 15

2.5 Standard Geometry Coarsening of a Regular Mesh . . . 16

2.6 Two-Level Solution Method . . . 18

2.7 The Multigrid V-Cycle . . . 19

2.8 Example of Color Scheme. . . 21

2.9 The Flowchart of Traditional AMG . . . 24

3.1 Flowchart of Our Proposed Method . . . 27

3.2 Comparison between Our Method and Previous Methods . . . 30

3.3 Original System Matrix . . . 35

3.4 Example of Aggregation . . . 36

3.5 System Matrix After Aggregation . . . 37

3.6 Structure of Aggregation of the Size of 3. . . 38

3.7 Coarse Grid Construction . . . 40

4.1 Run Time versus Circuit Size . . . 44

4.2 Cpu Time versus Circuit Size . . . 45

4.3 Non Zero Term versus Circuit Size . . . 46

(8)

List of Tables

2.1 Algorithm of Color Scheme . . . 20

3.1 Algorithm of Aggregation . . . 34

3.2 Algorithm of AggreConstruct . . . 34

3.3 Algorithm of Matrix Compensation . . . 39

4.1 Error percentage of RLKC circuits. . . 42

4.2 Runtime of RLKC circuits. “×” denotes this methodology failed. . . 43

4.3 Speed up of AbAMG compared to other methods . . . 43

(9)

Chapter 1 Introduction

This chapter gives an introduction of this thesis. Since our research topic is to develop

an efficient analysis tool for on-chip power/ground distribution network. We discuss the

basic concepts of on-chip power/ground distribution network in the beginning. The role

of an on-chip power/ground distribution network is to supply stable voltage references to

the on-chip circuitry and ensuring reliable operation of today’s high performance

micro-processors. However, as the ultra deep sub-micron technology scales below 0.18 µm,

cir-cuits with increasingly higher speed are being integrated with increasingly higher density.

Higher device densities and faster switching frequencies cause large switching currents

to flow in the on-chip power/ground distribution network, and will cause larger voltage

fluctuations due to IR drop and Ldi/dt noise which degrade the performance and

relia-bility of the circuit. High average currents flowing through the power/ground distribution

network may cause the undesirable electromigration effect which will degrade the

cir-cuit’s reliability. The descriptions of the voltage fluctuations and electromigration effect

are shown in the following [1] [2] [3]

• IR drop voltage noise: An example of the IR drop voltage noise is shown in

Fig. 1.1. IR drop mainly results from the resistance of the on-chip power network

where I represents current. If large current flows through the power network, an

un-acceptable voltage drop occurs. Large IR drop results from large current must

(10)

● ●

V

_IR-drop

V

_DD

I

Branch current

Fig. 1.1: An Example of IR Drop Voltage Noise

• Ldi/dt voltage noise: Ldi/dt noise occurs from a sudden change of current

flow-ing through a power network. With higher operation frequency of today’s high

performance IC design, Ldi/dt voltage noise becomes larger. Ldi/dt noise also

results from mutual inductance coupling effect. Two parallel wires may cause large

Ldi/dt voltage drop noise with each other.

• Electromigration: Electromigration effect results from a conductor with too much

current flowing through it and hence, the displacement of metal atoms due to

electron-flux. This behavior will cause shorts or opens in the metal lines, and degrading the

circuit’s reliability.

As the supply voltage scaling to control the power dissipation in the circuit [4], the

noise margin of the on-chip power/ground distribution network are sensitive to the voltage

(11)

With these reasons, the analysis of on-chip power/ground distribution network has

be-come a critical issue of today’s high performance IC design. In order to precisely predict

the voltage distribution and correctly simulate the behavior of the on-chip power/ground

distribution network, we model the active devices between power and ground

distribu-tion networks as time-varying current sources and gate capacitances [14]. By the way,

the power distribution network and ground distribution network can be separated for

sim-plicity. We focus on the simulation of power distribution network in this thesis and this

method can be extended to the ground distribution network analysis in the same manner.

The power distribution network is usually an irregular mesh and is modeled as RLKC

segments where R, L, and C represent the stamping matrix of resistors, inductances, and

capacitances, and K represents the susceptance matrix [22] [23] which is defined as the

inversion of L. The structure of power distribution network is shown in Fig. 1.2. Since

mutual inductance coupling has long range effect [5] which means that the coupling

be-tween two parallel wire segments decays very slowly with their separated distance, and

generates a dense matrix of L, for simplicity, the mutual inductances coupling effects are

not shown in Fig. 1.2.

The rest of this chapter is organized as following. In Section 1.1, we compare the

existing analysis methods for power network analysis and state our research motivation.

Our contributions and the organization of this thesis are presented in Section 1.2 and 1.3.

1.1 Motivations

In this section, we compare the existing analysis methods for power network and state our

research motivation. With the ultra deep sub-micron technology, several features of chips

(higher operating frequency, larger number of transistors, smaller feature sizes of

transis-tors and lower supply voltages) have made the integrity issues of power delivery network

become a key issue of high performance designs [6][7][8]. Generally, the power

(12)

V_dd Power Supply

On-Chip resistance and inductance

Time-Varying Current Sources Gate Capacitances

Fig. 1.2:The Structure of Power Network

highly efficient analyzers. Thus, the general circuit solvers such as SPICE by using direct

methods are not suitable for the power delivery analysis. In the past years, various efficient

methods have been proposed for the power delivery network analysis. The preconditioned

conjugate gradient (PCG) method is applied for solving power grid analysis in [9]. The

hierarchical methods are developed in [10][11]. The improved extended Krylov subspace

(IEKS) method developed in [11] extends the model order reduction technique to deal

with time-varying current sources without the moment shifting procedure. Multigrid-like

methods are developed in [12][13] to map the original problem to a reduced system with

smaller size by using the circuit’s geometry properties. However, these frameworks

pro-posed in [12][13] are hard to handle the coupling effects of mutual inductances. Hence, an

adaptive algebraic multigrid (AMG) method is used in [14] to analyze the power network.

It reformulates the system matrix and views the problem as an algebraic problem which

doesn’t need the geometry information. With these properties, the AMG based method

can handle mutual inductance coupling effects.

(13)

equation, Ae ≈ 0, where A is the system matrix and e represents the error vector. The

quality of mapping operator strongly depends on the choice of coarse grids and the

con-structed mapping operators only contain the local information of A. The mapping

opera-tors of AMG may lose a few of important error terms because of the inadequate choice of

coarse grids, hence, degrade the convergence rate. Therefore, an adaptive choice method

of coarse grids is developed in [14] to improve the above undesirable behavior. However,

this method needs to construct the mapping operators at each time step and may boost the

CPU time. To solve this problem, our aggregation-based algebraic multigrid (AbAMG)

method contains a global mapping operator construction procedure.

The idea of our mapping operator construction is based on the aggregation AMG

method used in [25] [26][27]. Aggregation methods originated in economics [28], where

similar products are considered together instead of individually. This procedure allows

significant reduction in the problem size, and maintaining accurate representation of the

overall behaviors. In multigrid terminology, the coarse grid is selected as a collection of

subsets of the fine grid. An algebraic partition is performed to the original fine grid and the

original system matrix is partitioned into several aggregated sub-matrices. The mapping

operators of aggregation AMG method are constructed from the system’s global

eigen-decomposition property. Generally, the error in the direction of an eigenvector associated

with a large eigenvalue is rapidly reduced by relaxation and the error in the direction of an

eigenvector associated with a small eigenvalue is reduced by a factor that may approach 1

as the eigenvalue approaches 0 [20]. The eigenvectors associated with small eigenvalues

of each sub-matrix are calculated to approximate the smooth error components of the

original system matrix and the mapping operator P is composed by these eigenvectors.

With accurate calculation of these eigenvectors, the mapping operator can project the

original system to a better transformed system than traditional AMG in [14] and achieving

better convergence rate. However, the eigen-decomposition complexity of the aggregated

(14)

In Chapter 3, we will show that the system matrix of the power delivery network

analysis problem has resistance dominate property when determining the aggregation.

The maximum matrix size of each sub-matrix of the original system is less than 4 and

the analysis problem has excellent property for aggregation AMG. The mapping

opera-tor construction of our proposed AbAMG method is based on the concept of aggregation

AMG and an innovative matrix compensation algorithm with a global error estimation

procedure is performed to further improve the quality of the mapping operators. The

mapping operator construction procedure of AbAMG is independent of the choice of

coarse grids and it only needs to be performed once for all time steps. With these

prop-erties, the AbAMG method can construct better mapping operators than the traditional

AMG method, and achieving better performance for solving the power delivery network

problem.

1.2 Our Contributions

This section we discuss our contributions in the following aspects

• The practicability of aggregation AMG to power network analysis: In this

the-sis, we discuss the practicability of the aggregation AMG to power network

analy-sis problem. Although the computation complexities of the eigen-decomposition

procedures of the aggregated sub-matrices grow rapidly with the matrix size. We

discuss the resistance dominate property when determining the aggregation of the

system matrix of the power delivery network. The maximum aggregation size is

only of 4 and the aggregation AMG method is efficient to analysis the power

de-livery network problem. We discovery the practicability of the aggregation AMG

to power network analysis and provide a new idea of AMG method to analysis the

power delivery network problem.

• Global error estimation: Although the real error of the analysis system is

(15)

relaxation process to a problem with known solution, we can obtain the information

about the troublesome error. The homogeneous equation, Ax = 0, serves us well

for this purpose where A is the system matrix and x is the solution vector. Our

global error estimation process begins by applying iterative method i times to the

homogeneous equation with a random initial guess x0. The resulting solution vector

xi can provide us the information about global error distribution. The error vector

contains information of the algebraic property of system matrix A and improves the

construction of AMG inter-grid mapping operators. The detail discussion of global

error estimation and mapping operators construction will be shown in Section 3.4.

• Global mapping operator construction: In contract to AMG, AbAMG constructs

the mapping operator from the global information of A. An aggregation algorithm

is performed to partition the original system matrix A into many sub-systems and

the approximated eigenvectors of A can be calculated from each sub-system

lo-cally. Since each sub-system is not totally independent and has weak connections

with each other. An algebraic matrix compensation algorithm is performed to catch

the weak connection effects. The error vector generated during the global error

estimation is used in the matrix compensation algorithm. After the compensation

algorithm, we can construct a better mapping operator from the global information

of system and get better performance than standard AMG.

1.3 Organization of this Thesis

The rest of this thesis is organized as follows. Chapter 2 introduces the Modified Nodal

Analysis method and the mathematical background of multigrid and traditional AMG.

Chapter 3 describes our proposed algorithm flow. In Chapter 4, we compare the

ex-perimental results of AbAMG, traditional AMG and IEKS methods. Finally, we give a

(16)

Chapter 2 Preliminaries

This chapter introduces several mathematic background knowledge that will be used in

this thesis. The Modified Nodal Analysis (MNA) method [15] is illustrated in Section

2.1. By using MNA, we can get the system equation, and modeling the original problem

as a linear algebraic problem. The direct and iterative methods for solving a linear

alge-braic problem are discussed in Section 2.2. Finally, the theory of the multigrid method is

presented in Section 2.3.

2.1 Modified Nodal Analysis

MNA is very useful for large circuit analysis and is easier to implement algorithmically

on a computer. The analysis principles of it and an example for RLC circuit are shown

below.

• Principles of MNA: To apply the MNA to a circuit with n nodes, m voltage sources

and k inductances. We apply the following steps.

– Step 1: Name the n nodes and currents through each current source. – Step 2: Name the currents through each voltage source and inductance. – Step 3: Apply Kirchoff’s current law to the n nodes. We take currents out of

a node to be positive.

(17)

– Step 5: Solve the system of n + m + k unknowns. Example: Consider a RLC circuit shown below:

R₁ L₁

C₁

V₁(t) _I _V₂_(t)

1(t)

Fig. 2.1: RLC Circuit

Apply step 1 and step 2:

R₁ L₁ C₁ V₁(t) _I _V₂_(t) 1(t) V_a(t) Vb(t) Vc(t) i_v1(t) _i_v2_(t) i_L(t)

Fig. 2.2: RLC Circuit After Step 1,2

Apply step 3:

N ode a : iv1(t) + I1(t) +

Va(t) − Vb(t)

R1

(18)

N ode b : Vb(t) − Va(t) R1 + C1 dVb(t) dt + iL(t) = 0 (2.2) N ode c : iv2(t) − iL(t) = 0 (2.3) Apply step 4: Vb(t) − Vc(t) = L1 diL(t) dt (2.4) Va(t) = V1(t) (2.5) Vc(t) = V2(t) (2.6) Apply step 5:         1 R1 − 1 R1 0 0 1 0 −1 R1 1 R1 0 1 0 0 0 0 0 −1 0 1 0 −1 1 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0                Va(t) Vb(t) Vc(t) iL(t) iv1(t) iv2(t)        +        0 0 0 0 0 0 0 C1 0 0 0 0 0 0 0 0 0 0 0 0 0 L1 0 0 0 0 0 0 0 0 0 0 0 0 0 0               Va(t)0 Vb(t)0 Vc(t)0 iL(t)0 iv1(t)0 iv2(t)0        =        −I1(t) 0 0 0 V1(t) V2(t)        (2.7)

Rearranging Equation (2.7), we can get the following equations:

        1 R1 − 1 R1 0 0 1 0 −1 R1 1 R1 0 1 0 0 0 0 0 −1 0 1 0 −1 1 0 0 0 −1 0 0 0 0 0 0 0 −1 0 0 0                Va(t) Vb(t) Vc(t) iL(t) iv1(t) iv2(t)        +        0 0 0 0 0 0 0 C1 0 0 0 0 0 0 0 0 0 0 0 0 0 L1 0 0 0 0 0 0 0 0 0 0 0 0 0 0               V_a0(t) V_b0(t) V_c0(t) i0_L(t) i0_v1(t) i0v2(t)        =        −1 0 0 0 0 0 0 0 0 0 0 0 0 −1 0 0 0 −1          I1(t) V1(t) V2(t)   (2.8)

From Equation (2.8), the MNA circuit equations of a linear RLC circuit can be

rep-resented as following:

ˆ

Gx(t) + ˆC d

(19)

where ˆ G =    G −AT l −ATVE AT l 0 0 AT VE 0 0    ˆ C =    C 0 0 0 L 0 0 0 0    x(t) =    v(t) il(t) iVE(t)   

v(t) corresponds to the unknown nodal voltages. il(t) and iVE(t) correspond to the

branch currents flowing through inductors and independent voltage sources. G, C and L

represent the stamping matrices of the resistors, the conductors and the inductors. Aland

AVE correspond to the coefficient matrices related to the inductors and the independent

voltage sources. u(t) is the vector of independent voltage sources and the independent

current sources. B is the coefficient matrix related to u(t). Integrating Equation (2.9)

from time t to (t + h), we can get the following equation

ˆ G Z t+h t x(t)dt + ˆC Z t+h t dx(t) dt dt = B Z t+h t u(t)dt (2.10)

Applying trapezoidal approximation [15] with time step h to Equation (2.10), we have

ˆ G x(t + h) + x(t) 2 ! h + ˆC(x(t + h) − x(t)) = B u(t + h) + u(t) 2 ! (2.11)

Reformulating Equation (2.11), we have

ˆ G + 2 h ˆ C x(t + h) = − ˆ G − 2 h ˆ C x(t) + B(u(t + h) + u(t)) (2.12)

Equation (2.12) can be viewed as a linear algebraic problem of Ax = b, where A

is equal toG +ˆ 2_hCˆ. The solution of x can be solved iteratively with time step h and each time step encounters an Ax = b problem which can be solved by direct or iterative

(20)

2.2 Direct and Iterative Methods for Solving Linear

Equa-tion

Direct Methods:

When solving a linear algebraic problem

Ax = b (2.13)

where A is n ∗ n matrix, x and b are n ∗ 1 vectors. The simplest direct method is to

calculate the inverse matrix of A and the solution of x is

x = A−1b (2.14)

Another direct method is the LU decomposition method [16]. Substituting the A matrix

into a product of lower- and upper-triangular matrices:

A = " . .. 0 L . .. # " . .. U 0 . .. # (2.15) We have LU x = b (2.16)

In order to solve Equation (2.16), we substitute

y = U x (2.17)

such that

Ly = b (2.18)

So we first solve the Equation (2.18) by F orward Substitution to obtain y and then

solve the Equation (2.17) by Back Substitution to get the final solution x.

Direct methods can always get the answer of a linear algebraic problem with high

computational complexity. Most direct methods have computational complexities in

(21)

direct analysis methods are prohibitive due to computational complexity.

Iterative Methods:

In contract to direct methods, iterative methods solve a linear algebraic problem

it-eratively [17]. It gets the answer after several iterations. Considering a linear algebraic

problem Ax = b, iterative method splits the matrix A into the form

A = M − N (2.19)

where M and N are n ∗ n matrices. So Equation (2.13) becomes

(M − N )x = b (2.20)

Reformulating Equation (2.20), we have

x = M−1N x + M−1b (2.21)

From Equation (2.19), we can get

M−1N = I − M−1A (2.22)

Substituting Equation (2.22) into Equation (2.21), we can get a standard iterative formula:

xi+1= (I − M−1A)xi+ M−1b (2.23)

where xi_{is the value of x after i − th iterations.}

Substituting the real solution x to the both sides of Equation (2.23), we can get the

error propagation equation as the following

ei+1= (I − M−1A)ei (2.24)

where ei _{= x − x}i

The matrix A can be decomposed as A = D + L + U where D, L, and U are the

ma-trices of the diagonal, lower triangular, and upper triangular elements of A. The

decom-positions of A are shown in Fig. 2.3. For the Jacobi and Gauss-Seidel iterative methods,

the M matrix are substituted by D and D + L [17], and the iterative solving scheme of

(22)

• Jacobi Relaxation:

xi+1= xi+ D−1ri (2.25)

• Gauss-Seidel Relaxation:

xi+1 = xi+ (D + L)−1ri (2.26)

where ri _{means the residual after i times iterations and is equal to b − Ax}i

D

L

U

Fig. 2.3: Decompositions of Matrix A

Iterative method converges to the correct answer after several iterations and the

com-putational complexity is often n log n per iteration [17]. The efficiency of iterative method

depends on how fast it can converge to the correct answer. From Appendix A, we can

know that the error in the direction of an eigenvector associated with a large eigenvalue

is rapidly reduced by relaxation and the error in the direction of an eigenvector associated

with a small eigenvalue is reduced by a factor that may approach 1 as the eigenvalue

ap-proaches 0. The smooth error components must be solved by efficient solution methods.

An example of iterative method is shown in Fig. 2.4. We apply the Gauss-Seidel

iterative method to a power network analysis problem of dimension 260 with random

(23)

each node is shown in (a). The error of each node after 5, 20, 50 times of Gauss-Seidel

relaxations are shown in (b), (c), (d). The error components that are easy to be eliminated

by the iterative methods are defined as the oscillatory errors and the error components

that are hard to be eliminated by the iterative methods are defined as the smooth errors.

We can find that the Gauss-Seidel relaxations eliminates the error slowly and some nodes

have large errors which need 50 times of iterations to eliminate. In order to solve this

stalling behavior and make the problem converge faster, the multigrid method is proposed

[18]. 0 50 100 150 200 250 300 -0.01 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 Error Node 0 50 100 150 200 250 300 -0.01 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 Error Node 0 50 100 150 200 250 300 -0.01 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 Error Node 0 50 100 150 200 250 300 0.00 0.02 0.04 0.06 0.08 0.10 Error Node

Initial Error Error After 5 Gauss-Seidel Relaxation Error After 20 Gauss-Seidel Relaxation Error After 50 Gauss-Seidel Relaxation (a) (b) (c) (d)

(24)

2.3 Multigrid Method

The earliest multigrid method is geometry multigrid (GMG) introduced by Brandt in

1973. It uses the geometry properties of system to construct a complementary multilevel

structure to overcome the stalling behavior in general iterative methods. The

complemen-tary multilevel structure is developed with two main ideas.

First, we know that the iterative method quickly eliminates oscillatory errors and we

must look for a process that can efficiently eliminate smooth errors. Since the global

com-putational complexity is proportional to the problem size, we transfer the original problem

from the original domain (fine domain) into a coarser domain such that we can attempt to

solve the problem there with cheaper computational cost. This domain transformation is

called the coarsening procedure and is determined by the geometry properties of system

in GMG. A standard geometry coarsening of a regular mesh is shown in Fig. 2.5. The

original analysis system is fine grid of larger dimension and the transformed system is

coarse grid of smaller dimension.

Fine Grid

Coarse Grid

(25)

Second, another difficulty is about how to represent the error on the coarser domain

since it is the quantity we do not know. Considering an algebraic equation, Ax = b, with

an approximation, ˆx, the residual is defined as

r = b − Aˆx = A(x − ˆx) = Ae (2.27)

So, although the error is unknown, it can be solved by using the residual equation, Ae = r.

If we calculate the residual r on the original fine domain and project this residual to the

transformed coarse domain. The residual equation of the coarse domain can be solved

and an error correction term ec can be obtained to correct the solution.

Based on these ideas, we can construct a two-level solution method as shown in

Fig. 2.6. First, we apply the iterative method to the equation Ax = b to eliminate the

oscillatory error components on the fine grid of dimension N . This step is also called the

relaxation step and the residual on the fine grids is calculated by r = b − Ax. Then,

the residual is restricted to the coarse grids with a smaller dimension M by rc _{= Rr,}

and the coarse grid operator is constructed by the Galerkin operator Ac = RAP . Here,

R is a M × N , P is a N × M matrix and R = PT_{. On the coarse grids, the residual}

equation, Ac_ec _{= r}c_{, is solved and the error correction term e}c_{is interpolated to the fine}

grids by e = P ec_{. The smooth error components not eliminated well by relaxation on the}

fine grid can be eliminated by the error correction term ec_{. A complementary two-level}

solution scheme can be constructed to overcome the stalling behavior of smooth error

components in general iterative methods. The correct solution is obtained by x = x + e,

and a post-relaxation step is applied on the fine grids to ensure that the oscillatory error is

not introduced through the coarse-grid correction step.

Applying the two-level solution method recursively, a multilevel solution method is

constructed and the coarsest residual equation can be solved with cheaper computational

cost. The multigrid V-cycle is shown in Fig. 2.7. The fine grid is labeled as level 1 and

the coarsest grid is labeled as level L. A relaxation step is first applied in the fine grid of

(26)

Relax on Ax = b Compute residual r= b - Ax Restrict rc_{= Rr} Correct solution x Å x + e Relax again Interpolate e = Pec SolveAc_ec_{= r}c ec_{= (A}c₎-1_rc

Fine Grid

Coarse Grid

A : N*N Ac _{: M*M} (M<N) R : M*N P : N*M Ac_{= RA}f_P

Fig. 2.6: Two-Level Solution Method

coarsest level is reached and the residual equation of level L is solved to get the correction

error term eL_{. Then, the error term e}L _{is interpolated to the fine grid of level 1 and the}

post-relaxation step is performed at each level.

Multigrid method constructs a complementary multilevel structure which can

effi-ciently eliminate all error components. The efficiency of multigrid method depends on

how to choose the coarse grid and determine the intergrid mapping operators P and R.

The mapping mechanism of GMG is easily determined with regular mesh but hard with

irregular mesh. In order to develop a more robust solving method, an algebraic multigrid

method is proposed in Section 2.3.1.

2.3.1 Traditional Algebraic Multigrid Method

AMG method was first introduced by Brandt in [19]. It is developed for solving problems

with irregular or unknown geometry properties. In contract to GMG, AMG uses only

information from the system matrix.

(27)

alge-Level 1 Relax Relax Relax Solve Relax Relax Relax Interpolate Interpolate Restrict Restrict 2 3 L

Fig. 2.7: The Multigrid V-Cycle

braic equation, Ax = b, AMG determines the inter-grid mapping operators, coarse grid

from the matrix A and the graph of it. Each row of the matrix A can be represented as

a node and its connection edge in a graph. The coefficients of the matrix A represent

the connections of the graph. For example, if |aij| = 0, there is no edge between node

i and j in the graph of A. If |aij| ≥ θ|aii|, we say that node j strongly influences i. If

|aij| ≤ θ|aii|, node j weakly influences i. Here, θ is a coefficient from 0 to 1 and is often

chosen to be 0.25.

With these definitions, we can construct the matrix graph of A and determine the

coarse grid by the color scheme algorithm [18]. This method begins by assigning a

measure to each node i of its potential quality to be a coarse node. The weight of node i

is determined by counting the number of nodes strongly influenced by node i. Then, we

choose the node i with maximum weight to be the starting coarse grid since it has good

potential to approximate other nodes. The nodes strongly influenced by node i are defined

as fine nodes since they can be approximated well by node i. It’s logical that the nodes

strongly influence the new fine nodes should be defined as coarse nodes since they can

approximate the new fine nodes well. Thus ,we increase the weights of the nodes strongly

(28)

Algorithm of Color Scheme

Input: The Graph of System Matrix A of Nodes 1, 2, ..., n

and the Related Weights w1, w2, ..., wnof These Nodes Output: The Sets of Coarse and Fine Nodes

1 Begin

2 NodeCounter=0

3 While NodeCounter!=n

4 MaxWeight=0, StartNode=1

5 For each node i

6 If node i is not defined as a coarse or fine node

7 If wi >MaxWeight

8 Then MaxWeight=wi, StartNode=i

9 EndFor

10 StartNode is defined as a coarse node, NodeCounter++ 11 For each node j that is strongly influenced by StartNode

12 If node j is not defined as a coarse or fine node

13 node j is defined as a fine node, NodeCounter++ 14 For each node k that strongly influences node j

15 If node k is not defined as a coarse or fine node

16 wk+ +

17 EndFor

18 EndFor

19 End.

Table 2.1: Algorithm of Color Scheme

the matrix graph of A are defined as coarse or fine nodes. The algorithm of color scheme

is shown in Table 2.1 and an example of it is given

Fig. 2.8 shows an example of color scheme. The description of each step is shown

below:

• Example of Color Scheme:

– Step a: A matrix graph of A is given with node number 1 to 14. – Step b: The weight of each node i is determined.

– Step c: Node 3 is defined as the starting coarse node with maximum weight

of 4 and node 1, 4, 13, 14 are defined as fine nodes. The weights of node 2, 5,

(29)

Strong connection Weak connection 1 3 4 2 3 3 3 2 1 1 1 2 2 2 6 14 3 12 11 13 9 4 10 2 1 5 7 8 1 F C 2 5 F 4 F 1 2 F 3 2 2 1 F C F C F 5 F 1 2 F 3 2 2 F F C F C F C F 1 2 F 3 2 2 F F C F C F C F 1 2 F C F 3 F F C F C F C F F 2 F C F C F F C F C F C F F C F C F C (a) (b) (c) (d) (e) (f) (g) (h)

Fig. 2.8: Example of Color Scheme

– Step d: Node 11 is defined as the new coarse node with maximum weight of

5 and node 12 is defined as new fine node. The weight of node 9 is increased

by 1.

– Step e: Node 9 is defined as the new coarse node with maximum weight of 5

and node 6 is defined as new fine node.

– Step f: Node 5 is defined as the new coarse node with maximum weight of 3

and node 7 is defined as new fine node. The weight of node 8 is increased by

1.

– Step g: Node 8 is defined as the new coarse node with maximum weight of 3

and node 10 is defined as new fine node.

– Step h: Node 2 is defined as the new coarse node with maximum weight of 2.

(30)

By using the color scheme algorithm, we can get the coarse grid and every fine node i

can be approximated well by the coarse nodes strongly influence i. However, the selected

coarse grid only considers local connections of each node i and may choose bad coarse

nodes that will decrease the convergence rate. To overcome this defect, we propose a

global mapping operator construction to build the global-considering coarse grids.

To further discuss the grid mapping operator, we continue the discussion of

inter-grid transfer operator. Since the key to the efficiency of the multiinter-grid method depends

on the complementarity of the relaxation and coarse-grid correction steps. We begin

the discussion of inter-grid transfer operator with the property of algebraic smoothness,

(Ae)i ≈ 0 which means that residual become small after several iterative iterations for

each row i. The equation can be rewritten as

aiiei ≈ −

X

j6=i

aijej (2.28)

We define that the DOFs of fine grid is C ∪ F , where C is the set of coarse-level nodes

and F is the set of remaining fine-level nodes. Rewriting Equation (2.28), we can get

aiiei ≈ − X j∈Ci aijej − X k∈Fi aikek (2.29)

where Ci = C ∩ Ni, Fi = F ∩ Ni, and Ni means the neighboring nodes of node i.

For further discussion, we divide the Fi into Fisand Fiw where Fisis the set of nodes

which strongly influence i in Fi, and Fiw is the set of nodes which weakly influence i in

Fi. Equation (2.29) can be rewritten as

aiiei ≈ − X j∈Ci aijej − X k∈Fs i aikek− X m∈Fw i aimem (2.30)

From Equation (2.30), we can try to define an interpolation structure since the ei for

each node is approximated by the neighboring coarse nodes Ci and fine nodes Fi. If we

can approximate the value of Fi as a sum of the values of Ci, ei can be approximated by

(31)

Since the values of F_is nodes are large compared to aii, we approximate ek by Ci in

the following form

ek≈ P j∈Ci akjej P l∈Ci akl (2.31)

Substituting Equation (2.31) into Equation (2.30) and adding the values of F_iw points

into aii, we can get the following equation

 a_ii+ X m∈Fw i aim  e_i = − X j∈Ci   aij + X k∈Fs i    aikakj P l∈Ci akl      ej (2.32)

From Equation (2.32), an interpolation formula for i ∈ F , ei = P j∈Ci wijej, can be defined with wij = − aij + P k∈Fs i   aikakj P l∈Ci akl   aii+ P m∈Fw i aim (2.33) The value of Fs

i is approximated by a sum of the value of Ci, and the value of Fiw

is simply added to aii. However, the selection of Fis and Fiw is fully determined by their

coefficients in Equation (2.30), and this would cause the bad choice of Fs

i and Fiw. Some

nodes of Fw

i with large errors should be labeled in the set of Fis. This behavior will

decrease the convergence rate of standard AMG. One of the main object of our AbAMG

is to overcome this defect.

After introducing the concepts of color scheme and weights calculation, the flowchart

of traditional AMG is shown in Fig. 2.9. At first, a cycle construction is performed to

construct the multigrid V-cycle. In the fine grid, a color scheme is performed to determine

the coarse grid and the weights of intergrid transfer operator can be calculated by Equation

(2.33). The coarse grid operator Ac can be derived from the Galerkin operator Ac =

RAfP . We apply these steps repeatedly until the coarsest grid operator is coarse enough.

After the step of traditional AMG cycle construction, we can derive the multigrid V-cycle

(32)

Input

Color scheme

Weights calculation

Coarse grid

construction

Can level (i+1) be solved? i=1 i=i+1 No Yes

Output

Level 1 2 3 L

Traditional AMG cycle construction

Multilevel solver

Fig. 2.9: The Flowchart of Traditional AMG

AMG construct the coarse grid and transfer operator from the property of system

matrix only. It can be applied to various types of problems without additional geometry

information. However, the construction of mapping operator R and P strongly depends

on the choice of coarse grid and contains only local information of the system. Another

algebraic multigrid method using the aggregation concept is introduced in Section 2.3.2.

2.3.2 Aggregation Algebraic Multigrid Method

Aggregation methods originated in economics [28], where similar products are

consid-ered together instead of individually. This procedure allows significant reduction in the

problem size, and maintaining accurate representation of the overall behaviors. In

multi-grid terminology, the coarse multi-grid is selected as a collection of subsets of the fine multi-grid. An

(33)

partitioned into several aggregated sub-matrices.

The idea of the mapping operator construction of aggregation AMG [25][26][27] is

based on the concept that the smooth error components are in the directions of the

sys-tem’s eigenvectors associated with small eigenvalues [27]. An algebraic partition is

per-formed in the aggregation AMG according to the connections of the nodes in the graph

of the system matrix A and the nodes with strong influence between them are clustered

together in an aggregation. A node-by-node aggregation algorithm is discussed in

Sec-tion 3.5.1. After the aggregaSec-tion procedure, an eigen-decomposiSec-tion procedure is

per-formed in each aggregated sub-matrix and the eigenvector related to the small eigenvalue

is used to compose the mapping operator P . With accurate calculation of the system’s

smooth error components, the aggregation AMG can achieve better convergence rate than

traditional AMG. However, the weak connected coefficients of small values between

ag-gregations are simply neglected or added to the diagonal elements in the aggregated

sub-matrices, and decreasing the convergence rate of aggregation AMG. An innovative

ma-trix compensation algorithm with a global error estimation procedure is proposed in our

(34)

Chapter 3 Aggregation-Based Algebraic Multigrid

In this chapter, we will introduce the algorithm flow of our proposed method. We first

state the problem formulation of the research in Section 3.1. In Section 3.2, we show

the algorithm flowchart of AbAMG and compare the main differences between it and

traditional AMG. The overview of our approach and previous works and the derivation of

analysis system equations are discussed in Section 3.3 and Section 3.4. Finally, the cycle

construction of our method and multilevel solver are discussed in Section 3.5 and 3.6.

3.1 Problem Formulation

The problem formulation of AbAMG for on-chip power network analysis can be

formu-lated as follows.

• Input: A RLKC network netlist and the independent voltage sources are given for

on-chip power network. The external current sources are modeled as time varying

piecewise linear current sources.

• Output: The voltage waveform of each node and the current waveform of each

wire segment are shown with respect to time.

(35)

3.2 Flowchart of Our Proposed Method

This section we show the flowchart of our proposed method and point out the main

differ-ences between our method and traditional AMG. The flowchart of our proposed method

is shown in Fig. 3.1 Input 10 9 3 4 7 2 11 6 ₁₂ 8 5 1 A₁ A₂ A₃ A₁ A₃ P₁ P₂ P₃ A₂ No Yes Output Global error estimation i=1 i=i+1 Aggregation Matrix compensation Coarse grid construction Multilevel solver

Can level (i+1) be solved?

Aggregation-based AMG cycle construction

Fig. 3.1: Flowchart of Our Proposed Method

At first, an aggregation-based AMG cycle construction is performed to construct the

multigrid V-cycle. In the fine grid, a global error estimation gives an estimation of global

null-space error. This information can be used to compensate the following aggregated

sub-matrices. Then, an aggregation algorithm is applied to partition the original system

matrix into many sub-matrices to localize the problem. After that, a matrix

compensa-tion algorithm compensates the sub-matrices and the intergrid transfer operator can be

(36)

the Galerkin operator Ac = RAfP . These steps are applied recursively to construct the

multigrid V-cycle and the answer of x can be solved by the multilevel solver.

The main difference between our method and traditional AMG is based on the cycle

construction step. Traditional AMG selects the coarse grid by the color-scheme algorithm

and begins the construction of intergrid transfer operator with the concept of algebraic

smoothness Ae ≈ 0. Our proposed method begins the cycle construction step with the

aggregation concept. Since most iterative methods quickly eliminate the components of

the error in the directions of the eigenvectors of the system matrix associated with the large

eigenvalues [20]. If we can get the eigenvectors of the system matrix A, the interpolation

operator can be composed by the eigenvectors associated with small eigenvalues and a

multilevel structure that can efficiently eliminate all error components can be constructed.

However, the real eigenvectors of the system matrix can’t be calculated directly. We

apply an aggregation algorithm to the original system matrix to perform an algebraic

partition and try to get the approximated eigenvectors of the system matrix from the

ag-gregated sub-matrices locally. The error estimation and matrix compensation steps are

used to improve the quality of the aggregated sub-matrices. The eigenvector associated

with the small eigenvalue in each aggregated sub-matrix is used to compose the intergrid

transfer operator R and P . By using aggregation-based AMG, we can project the original

system matrix into another domain to expose the low frequency errors in the original

do-main, and construct a smaller interpolation operator compared to traditional AMG with

the aggregation property.

3.3 Overview of Our Approach and Previous Works

In this section, we compare the existing published multigrid methods on the power

net-work analysis in Fig. 3.2. The multigrid method was first applied to the power netnet-work

problem by Nassif in [12][13]. It applied the geometry multigrid method and extended

(37)

method modeled the power network as the RC segments, and had problem with mutual

inductance coupling effects.

The concept of algebraic multigrid method was first applied to power network

prob-lem by Su. It used a geometry-like mapping operator constructing scheme to solve the

problem in short time. However, the model of this method didn’t include the mutual

inductance and the rough mapping operator constructing scheme lead to large errors.

Another algebraic multigrid solver is developed in [14]. It used the traditional

alge-braic multigrid method with RLKC model and an adaptive coarsening scheme is applied

to improve the convergence rate. However, the coarsening scheme must construct the

cycle at every time steps and will increase the CPU time.

Our proposed method developed an aggregation-based AMG method for power

net-work problem. Our proposed method used the RLKC model as [14]. We discuss the

practicability of the aggregation AMG to the problem and an additional matrix

compen-sation algorithm with a global error estimation method is applied to further improve the

convergence rate.

3.4 Derivation of System Equations

By using MNA, the system equation of an irregular power network can be formulated as

following ˆ Gx(t) + ˆC d dtx(t) = Bu(t) (3.1) where ˆ G =    G −AT l −ATVE AT l 0 0 AT VE 0 0    ˆ C =    C 0 0 0 L 0 0 0 0    x(t) =    v(t) il(t) iV (t)   

(38)

Adaptive coarsening scheme RLKC Algebraic multigrid Zhu [DAC03] Construct the mapping operators from system’s eigenvectors Geometry-like mapping operator construction Geometry coarsening scheme for irregular power network Features RLKC RLC RC Model AbAMG (Aggregation-based AMG) Algebraic multigrid Geometric multigrid Method Our proposed method Su [DAC03] Nassif [DAC00] [ICCAD01]

Fig. 3.2: Comparison between Our Method and Previous Methods

The definitions of ˆG, ˆC and x(t) are the same as Section 2.1. Since the independent

voltage sources are known, it is not necessary to solve the nodes of independent voltage

sources and the currents flowing through them. With this idea, the analysis dimension can

be reduced and the system equation can be rewritten as following

˜ G˜x(t) + ˜C d dtx(t) = ˜˜ B ˜u(t) + ˜GEvE(t), (3.2) where ˜ G = _G n −ATln AT ln 0 , C =˜ _C n 0 0 L , ˜ x(t) = _v n(t) il(t) , G˜E = _G E LE .

Here, vE(t), vn(t), and il(t) correspond to the vectors of the independent voltage

sources, the unknown nodal voltages, and the branch currents flowing through inductors,

respectively, Gn, Cn, and L represent the stamping matrices of the resistors not

(39)

coefficient matrix related to those inductors not connecting to vE(t), ˜u(t) is the vector

of independent current sources, and ˜B, GE, and AlE are the coefficient matrices related

to ˜u(t), the stamping of resistors between vn(t) and vE(t), and the connecting of L and

vE(t), respectively.

Applying trapezoidal approximation with time step h, Equation (3.2) can be

reformu-lated as following 2Cn h + Gn −A T ln Aln 2L h vn(t + h) il(t + h) = 2 GE AlE vE(t) + 2Cn h − G A T ln −Aln 2L h _v n(t) il(t) + ˜B _{u(t + h) + ˜}_˜ _u(t) 0 . (3.3)

After the time domain discretion, we can observe that the transient analysis system

matrix in Equation (3.3) is not symmetric and positive definite due to the introduction of

current variables.

Since the Multigrid method requires the matrix to be symmetric positive definite, some

extra processing is needed to reformulate the system matrix [21]. Similar to the method

used in [9], we split the variable vector into nodal voltage vector and branch current vector.

By using block matrix operations, we can decompose Equation (3.3) into two iteration

formulas for nodal voltages and branch currents. The system equations are reformulated

as following 2Cn h + Gn+ h 2A T lnL −1 Aln ! vn(t + h) = 2Cn h − Gn− h 2A T lnL −1 Aln ! vn(t) + 2ATlnil(t) + ˜B(˜u(t + h) + ˜u(t)) + hAT_l_nL−1AlEvE(t) + 2GEvE(t) (3.4) il(t + h) = il(t) − h 2L −1 Al(vn(t + h) + vn(t)) + hL−1AlEvE(t) (3.5)

Since the matrices Gn, Cn, and L−1(or K) are SPD, we can prove that the system

ma-trix of Equation (3.4) is still SPD. The L−1(K) [22] is sparser than the original L matrix,

and the above symmetric property can save 50% of the memory usage. Equation (3.4) is

equivalent to solve an Ax = b problem, where A is equal to (2Cn

h + Gn+ h 2A

T

(40)

and this problem can be solved by the two-level solution method, and the solutions of

Equation (3.4) and (3.5) are solved iteratively.

3.5 Aggregation-Based AMG Cycle Construction

In this section, a global mapping operator construction of our AbAMG is presented. At

first, a node-by-node aggregation algorithm is shown in Section 3.5.1. The practicability

of the aggregation AMG method to the power network analysis problem is discussed in

Section 3.5.2. The global error estimation procedure and matrix compensation algorithm

are stated in Section 3.5.3 and Section 3.5.4. Finally, the mapping operator construction

procedure is stated in Section 3.5.5.

3.5.1 Aggregation Algorithm

The purpose of the aggregation method in AMG is to reformulate the original system

matrix such that the smooth error components of the system can be calculated from the

modified system easily. Different from the difficulty of performing a geometry partition

with the complex mutual inductance coupling effects on the circuit topology, the

aggrega-tion method provides an easy approach to partiaggrega-tion the problem in the algebraic manner,

and simplifying the problem.

A node-by-node aggregation algorithm is discussed in this section. The definition of

strong connection between nodes i and j provides a good measurement when

determin-ing aggregations. Nodes i and j are defined to have a strong connection if there is any

strongly influence relation between them. If there is no strongly influence relation

be-tween node i and node j, we say that they have a weak connection. Node with maximum

number of strong connections acts a good candidate to be the starting node in the

aggre-gation algorithm and nodes with strong connections between them must be labeled in the

same aggregation since the value of aij is large with respect to aii. The nodes with weak

(41)

only be included in an aggregation. The rules of aggregation can be concluded in the

following

• Aggregation Rules:

– Select the node with maximum number of strong connections in the graph as

the starting node when determining an aggregation

– Every node must be included in an aggregation – Each node can not be labeled to different aggregation

– Let the nodes which have strong connections between them be labeled in same

aggregation

– Let the nodes which have weak connections between them be labeled in

dif-ferent aggregations

Considering a system equation, Ax = b, the aggregation algorithm is shown in Table

3.1 and 3.2.

An example of aggregation algorithm is shown in the following. Considering a system

equation Ax = b, the original system matrix A and the graph of it are shown in Fig. 3.3.

Fig. 3.4 shows an example of aggregation. The description of each step is shown

below:

• Example of Aggregation:

– Step a: A matrix graph of A is given with node number 1 to 12.

– Step b: The weight of each node i is determined by counting the number of

strong connections.

– Step c: Node 9 is defined as the starting aggregated node with maximum

(42)

Algorithm of Aggregation

Input: The Graph of System Matrix A of Nodes 1, 2, ..., n

and the Related Weights w1, w2, ..., wnof These Nodes Output: Aggregations 1, 2, ..., m

1 Begin

2 NodeCounter=0, AggCounter=0

3 While NodeCounter!=n

4 MaxWeight=0, StartNode=1

5 For each node i

6 If node i is not in an aggregation

7 If wi >MaxWeight

8 Then MaxWeight=wi, StartNode=i

9 EndFor

10 AggCounter++

11 j=AggCounter, StartNode is labeled in aggregation j

12 NodeCounter++

13 AggreConstruct(StartNode) 14 End.

Table 3.1: Algorithm of Aggregation

Algorithm of AggreConstruct

Input: The Node i and It’s Strongly Connected Nodes n1, n2, ..., ns

1 Begin

2 For Each strongly connected node k of node i

3 If node k is not in an aggregation

4 Node k is labeled in aggregation j 5 NodeCounter++

6 AggreConstruct(k)

7 EndFor

8 End.

(43)

10 9 3 4 7 2 11 6 12 8 5 1 Matrix Graph of A Strong connection Weak connection ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ ⎤ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ ⎡ 10 5 5 5 5 10 1 5 5 1 10 5 1 5 5 5 10 5 5 5 10 1 5 1 5 5 1 5 1 10 5 1 1 5 5 5 10 1 1 1 10 5 5 5 5 10 5 1 1 5 10 1 5 5 5 1 10 1 5 5 1 10 Original A

Fig. 3.3: Original System Matrix

– Step d: Node 6, 7, 11, 12 have strong connections to node 9 and are labeled

to aggregation a.

– Step e: Node 4 is defined as the starting aggregated node with maximum

weight of 3 and labeled to aggregation b.

– Step f: Node 2, 8, 10 have strong connections to node 4 and are labeled to

aggregation b.

– Step g: Node 1 is defined as the starting aggregated node with maximum

weight of 2 and labeled to aggregation c.

– Step h: Node 3, 5 have strong connections to node 1 and are labeled to

aggre-gation c. Every nodes are included in an aggreaggre-gation and the aggreaggre-gation step

is finished.

After the aggregation step, the modified system matrix A is shown in Fig. 3.5. We

can find that the nodes have strong connections between them are clustered together in

the same aggregation. The modified system matrix is reformulated into three aggregated

(44)

10 9 3 4 7 2 11 6 12 8 5 1 2 4 2 3 3 2 3 3 ₃ 3 2 2 (a) (b) (c) (d) 2 a 2 3 3 2 3 3 ₃ 3 2 2 2 a 2 3 a 2 a a _a 3 2 2 (e) (f) (g) (h) 2 a 2 b a 2 a a a 3 2 2 b a 2 b a b a a _a b 2 2 b a 2 b a b a a a b 2 c b a c b a b a a a b c c

Strong connection Weak connection

Fig. 3.4: Example of Aggregation

by the eigen-decomposition analysis of these sub-matrices. The weakly connected

coef-ficients between aggregations are simply added to the diagonal elements of the

aggrega-tions or neglected since their value is small compared to the diagonal elements. However,

some nodes with weakly connected coefficient may have large errors in the global view,

and decreasing the convergence rate. A global error estimation procedure is presented in

Section 3.5.3 to give an estimation of these troublesome error components and the related

matrix compensation algorithm is shown in Section 3.5.4 to improve this defeat.

3.5.2 Practicability of Aggregation AMG Method to Power Network

Analysis

In this subsection, we discuss the practicability of the aggregation AMG method to power

network analysis problem. The aggregation AMG method is applied to the Ax = b

prob-lem in equation (3.4), where A is equal to (2Cn

h + Gn+ h 2A T lnL −1_A

(45)

Aggregation a Aggregation b Aggregation c ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ ⎤ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ ⎡ 10 5 5 5 5 10 5 5 1 5 5 10 5 5 5 5 10 5 1 1 1 1 5 5 5 10 1 1 1 10 5 5 1 5 10 5 1 5 5 10 5 5 5 10 1 1 1 1 10 5 5 1 1 1 5 10 5 1 5 5 10 b a c b a b a a a b c c

Modified A Matrix Graph of A

Strong connection

Weak connection

Fig. 3.5: System Matrix After Aggregation

technology model (PTM) developed by the Berkeley university (http://www.eas.asu.edu/ ptm/),

the dimensions of R,L,C in the 0.13µm technology are shown as following, where R =

0.046ohm/µm, L = 1.69pH/µm, C = 0.13011f F/µm, length of each wire segments is

of 100µm. The contributions of the values of 2Cn

h , Gn, and h 2A T lnL −1_A

ln vary from 5E-3

to 8E-3, 2E-1 to 4E-1 and 2E-2 to 5E-2. The contribution of the value of the Gn term is

often 10 times larger than other terms of A. From the aggregation algorithm discussed in

Section 3.5.1, we can know that the determination of the aggregation of A is dominated by

the effects of the resistances of Gn. Since the on-chip power delivery network is of mesh

structure, most aggregations are of the size of 3 as shown in Fig. 3.6 and the maximum

size of the aggregation is less than 4 with via connected to the structure of Fig. 3.6.

3.5.3 Global Error Estimation

In this subsection, we introduce a global error estimation step. In this thesis, we can

know that the efficiency of multigrid method depends on the complementarity between

relaxation and coarse-grid correction. The error components not efficiently reduced by

(46)

com-L₃ 1 2 3 4 1_a 2_a 4_a 3_a 5 R₃ L₁ R₁ R₂ L₂ R₄ L₄

Fig. 3.6: Structure of Aggregation of the Size of 3

ponents are the quantity we do not know. A simple method to gain information about the

errors that relaxation does not efficiently reduce is applying the relaxation scheme to a

problem with known solution.

The homogeneous equation, Ax = 0, serves us well for this purpose. The real error

of Ax = 0 can be known since the exact solution of this equation is zero. By applying

relaxation several times to this equation with a random initial guess, we can get a error

vector, eG, which can represent the error component that the relaxation can not eliminate

well. This candidate error vector can provide information about troublesome error and be

used to compensate the aggregated matrices in our method. The compensation algorithm

will be introduced in Section 3.5.4.

3.5.4 Matrix Compensation Algorithm

In this subsection, we introduce a matrix compensation algorithm in this thesis.

Consider-ing a linear algebraic system equation AeG = 0 of dimension N . We begin the discussion

(47)

Algorithm of Matrix Compensation

Input: Original System Matrix A and Aggregations 1, 2, ..., n Output: Aggregated Sub-matrices A1, A2, ..., An

1 Begin

2 For each aggregation m

3 For each node i in aggregation m, sweep the i-th row of A

4 If node j is within aggregation m

5 Then Amij = Aij

6 Else

7 If Node j is a strong node,sweep the j-th row of A

8 Total = 0

9 For each column k in row j

10 If node k is within aggregation m

11 Then total+ = Ajk

12 EndFor

13 For each column k in row j

14 If node k is within aggregation m

15 Then Amik+ = Aij × Ajk/T otal 16 EndFor 17 Else 18 EndFor 19 EndFor 20 End.

Table 3.3: Algorithm of Matrix Compensation

Node i is defined as strong node if eGi > λmax(eGj)

n

j=1 and defined as weak node if

eGi < λmax(eGj)

n

j=1. Here, λ is a coeffient from 0 to 1 and is chosen to be 0.25 in our

algorithm.

With this concept, the matrix compensation algorithm is shown in Table 3.3. The weak

connected coefficients related to the strong nodes are approximated with the aggregated

nodes and the effects related to the weak nodes are simply neglected in our algorithm.

By using this algorithm, we can construct a better system for analysis and make each

aggregated sub-matrix independent from other sub-matrix. The compensation procedure

is exactly matched the weight calculation step in traditional AMG. The modified

集合體基礎的代數多重網格方法在晶片上功率網路分析上的應用

國

立

交

通

大

學

電 信 工 程 學 系

碩 士 論 文

集合體基礎的代數多重網格方法在晶片

上功率網路分析上的應用

An Aggregation-Based Algebraic Multigrid Method

for On-Chip Power Network Analysis

研 究 生：周桓宇

指導教授：李育民 教授

集合體基礎的代數多重網格方法在晶片上功率網路分析上的應用

An Aggregation-Based Algebraic Multigrid Method

for On-Chip Power Network Analysis

研 究 生：周桓宇 Student：Huan-Yu Chou

指導教授：李育民 Advisor：Yu-Min Lee

國 立 交 通 大 學

電 信 工 程 系

碩 士 論 文

集合體基礎的代數多重網格方法在晶片上功率網路分析上的應用

學生：

指導教授

國立交通大學電信工程學系碩士班

摘

要

隨著次深次微米技術演進到 0.18 微米以下，晶片上功率傳輸網路的分析，已

經變成在今日的高效能晶片設計下的ㄧ個非常重要而且具挑戰性的ㄧ個問題。功

率傳輸網路上較低的電源電壓，將會減少電路的雜訊容忍度。除此之外，較高的

電路操作頻率，將使得由 Ldi/dt 電壓壓降而來的電路雜訊為之增加。這些效應將

會增加功率傳輸網路的設計複雜度以及對於有效率的功率傳輸網路分析方法的

需求。

在本篇論文當中，對於功率傳輸網路分析，我們提出了ㄧ個以集合體基礎的

代數多重網格分析方法。首先，我們將原始的功率傳輸網路模擬成許多 RLKC 元

件以及片段線性的電流源。然後，利用修飾節點分析方法，我們可以把原始問題

轉換成一個 Ax=b 的線性代數問題。在此，A 是一個

的矩陣，x 和 b 是

的

向量。在對於這個線性代數問題，應用了我們的集合體演算法之後，我們可以把

原始的系統矩陣分成許多小的子矩陣，並實行一個代數的切割去簡化問題。

實驗結果顯示出，我們的集合體基礎的代數多重網格方法，在時間和記憶體

方面，比較傳統的代數多重網格方法以及現存的改善 Krylov 子系統方法，都得

到了更好的結果。

An Aggregation-Based Algebraic Multigrid Method

for On-Chip Power Network Analysis

Student：Huan-Yu Chou

Advisor：Dr.

Yu-Min Lee

Department of Communication Engineering

National Chiao Tung University

ABSTRACT

As the ultra deep sub-micron technology scales down to 0.18 µm, power

distribution network analysis becomes one of the most critical and challenging

problems in today’s high performance chip design. Lower supply voltage on power

distribution network decreases the circuit noise margin and higher circuit operation

frequency increases the circuit noise from Ldi/dt voltage drop. Those effects increase

the design complexity of power distribution network and also increase the demand of

efficient power distribution network analysis methods.

In this thesis, we present an aggregation-based algebraic multigrid method for

power distribution network analysis. First, we model the original power distribution

network with RLKC segments and piecewise linear current sources. Then we use

modified nodal analysis to transform the problem into an Ax=b linear algebraic

problem where A is a

matrix, x and b are

vectors. By performing an

aggregation algorithm, the original system matrix is divided into many small

sub-matrices and an algebraic partition is performed to simplify our problem.

Experimental results show our aggregation-based algebraic multigrid method runs

faster and spend less memory usage than both traditional algebraic multigrid method

and the existing IEKS (Improve Krylov Subspace) method.

誌

謝

在這篇論文能順利完成的當下，我在此首先最要感謝的就是我的

指導教授 李育民博士的指導，老師的專業背景和認真的研究態度，

總是能適時的指出學生在研究時所無法發現的盲點，從而做出正確的

修正。在碩士生捱所學到的研究態度不只是得到這個學位，相信在之

後的人生中，也是學到了應該正確面對事情的態度。

電信工程學系

碩士論文

研究生：周桓宇

指導教授：李育民教授

研究生：周桓宇 Student：Huan-Yu Chou

國立交通大學

電信工程系

碩士論文

指導教授李育民博士的指導，老師的專業背景和認真的研究態度，