3.6 Aggregation-Based Multilevel Solver

This section introduces the overall solver of the proposed method. The system equations derived in Section 3.4 are repeated below:

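$$\left(\frac{2C_n}{h} + G_n + \frac{h}{2} A_{l_n}^T L^{-1} A_{l_n}\right) v_n(t+h) = \left(\frac{2C_n}{h} - G_n - \frac{h}{2} A_{l_n}^T L^{-1} A_{l_n}\right) v_n(t) + \tilde{B}\left(\tilde{u}(t+h) + \tilde{u}(t)\right) + 2 A_{l_n}^T i_l(t) + h A_{l_n}^T L^{-1} L_E V_E + 2 G_E V_E \tag{3.6}$$

$$i_l(t+h) = i_l(t) - \frac{h}{2} L^{-1} A_l \left(v_n(t+h) + v_n(t)\right) + h L^{-1} L_E V_E \tag{3.7}$$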

First, we apply the aggregation-based AMG cycle construction procedure to the system matrix $\frac{2C_n}{h} + G_n + \frac{h}{2} A_{l_n}^T L^{-1} A_{l_n}$ in Equation 3.6. Then we apply the AbAMG solver to compute $v_n(t+h)$. Given $v_n(t+h)$, we obtain $i_l(t+h)$ from Equation 3.7. By evaluating Equations 3.6 and 3.7 repeatedly, one time step after another, we solve the power network problem and obtain the voltage waveforms at the analyzed voltage nodes. The mapping operators of AbAMG are constructed from global information about the system. For a fixed time step $h$ the system matrix does not change, so this construction needs to be performed only once for all time steps.
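The recursion over Equations 3.6 and 3.7 can be illustrated with a minimal scalar sketch. It is not the thesis implementation: it assumes a single node and a single inductor branch (so every matrix becomes a number and the incidence term is 1), sets $\tilde{B} = 1$, drops the $V_E$ terms, and uses made-up element values. The function name `simulate` and its parameters are hypothetical.

```python
# Scalar sketch of the recursion in Equations 3.6 and 3.7 (trapezoidal rule).
# Assumptions (not from the thesis): one node, one inductor branch (A = 1),
# B = 1, no V_E terms, and made-up element values.

def simulate(C, G, L, h, u, steps):
    """Advance v_n and i_l over `steps` time steps of size h.

    u is a function of time that returns the source current.
    Returns the list of node voltages and the final inductor current.
    """
    a = h / (2.0 * L)                # scalar form of (h/2) A^T L^-1 A
    lhs = 2.0 * C / h + G + a        # system "matrix": built once, reused every step
    rhs_coeff = 2.0 * C / h - G - a  # coefficient of v_n(t) on the right-hand side
    v, i, t = 0.0, 0.0, 0.0
    waveform = [v]
    for _ in range(steps):
        rhs = rhs_coeff * v + (u(t + h) + u(t)) + 2.0 * i
        v_next = rhs / lhs           # matrix case: a linear solve (AbAMG in the thesis)
        i = i - a * (v_next + v)     # Equation 3.7 without the V_E term
        v = v_next
        t += h
        waveform.append(v)
    return waveform, i


# Constant 1 A source; with these values the transient decays well before t = 50.
waveform, i_final = simulate(C=1.0, G=1.0, L=1.0, h=0.01, u=lambda t: 1.0, steps=5000)
```

Because `lhs` does not depend on the time step index, any setup work tied to it (the multigrid hierarchy in the matrix case) can be done once before the loop, which is the point made in the paragraph above.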

The experimental results are shown in Chapter 4.

Chapter 4 Experimental Results

This chapter demonstrates the speed and accuracy of the proposed AbAMG solver and compares its results with those of other methods. The power delivery networks are randomly generated mesh networks consisting of lumped RLKC segments and many current sources. The solver is implemented in C++ and tested on a 3.4-GHz Pentium IV machine with 3 GB of memory.

First, InductWise [24], an efficient and accurate time-domain solver, is used as the reference to demonstrate the accuracy of our method. The accuracy for the RLKC circuits is shown in Table 4.1.

In Table 4.1, Min V denotes the minimum voltage drop of each test circuit.

The minimum voltage drop of every test circuit is at least 0.832 V. For AbAMG with compensation, the maximum error is at most 0.973% and the average error is at most 0.067% across the RLKC test circuits. These results demonstrate the accuracy of our algorithm.
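The thesis does not spell out how the error percentages are defined. One common choice, assumed here purely for illustration, is the relative deviation from the reference waveform at each sample, reported as its maximum and its mean. The function name `error_percentages` and the sample voltages are hypothetical and are not data from the experiments.

```python
def error_percentages(reference, computed):
    """Return (max, average) relative error in percent between two waveforms.

    Assumes every reference sample is non-zero, which holds for supply
    voltages that stay close to their nominal value.
    """
    if len(reference) != len(computed) or not reference:
        raise ValueError("waveforms must be non-empty and of equal length")
    errors = [abs(c - r) / abs(r) * 100.0 for r, c in zip(reference, computed)]
    return max(errors), sum(errors) / len(errors)


# Made-up sample voltages, for illustration only (not data from the thesis).
ref = [1.00, 0.90, 0.85, 0.88]
out = [1.00, 0.909, 0.85, 0.8712]
max_err, avg_err = error_percentages(ref, out)
```

Under this definition a single badly matched sample dominates the maximum while barely moving the average, which is consistent with the gap between the Max and Avg columns in Table 4.1.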

Table 4.1: Error percentage of RLKC circuits

| Circuit Size | Min V | Standard AMG Max Error (%) | Standard AMG Avg Error (%) | AbAMG without compensation Max Error (%) | AbAMG without compensation Avg Error (%) | AbAMG with compensation Max Error (%) | AbAMG with compensation Avg Error (%) |
|---|---|---|---|---|---|---|---|
| 49.6K | 0.85 | 1.173 | 0.077 | 1.165 | 0.077 | 0.973 | 0.067 |
| 199.2K | 0.832 | 1.107 | 0.066 | 1.059 | 0.065 | 0.885 | 0.058 |
| 448.8K | 0.835 | 1.17 | 0.065 | 1.072 | 0.064 | 0.955 | 0.056 |
| 798.4K | 0.842 | 1.13 | 0.067 | 1.081 | 0.067 | 0.967 | 0.059 |

To show the efficiency of our AbAMG solver, a DC analysis and 50 transient time steps are executed, and the results are compared with three state-of-the-art methods: IEKS [11], InductWise [24], and standard AMG. The comparison results for different RLKC circuits are shown in Table 4.2, where RT is the CPU run time and Mem is the memory usage. RT* is the run time of AbAMG without compensation and RT** is the run time of AbAMG with compensation. The speedup of our method for each test circuit is shown in Table 4.3, where S_In, S_IEKS, S_AMG, and S_No are the speedups of AbAMG with compensation with respect to InductWise, IEKS, standard AMG, and AbAMG without compensation, respectively. We observe a significant speed improvement (at least 26 times faster than InductWise [24], at least 2 times faster than IEKS [11], and about 1.27 times faster than standard AMG) and lower memory usage (about two fifths of the memory used by InductWise [24] and about half of that used by IEKS [11]).

Table 4.2: Run time and memory usage of RLKC circuits. "×" denotes that the method failed.

| Circuit Size | InductWise [24] RT (s) | InductWise [24] Mem (MB) | IEKS [11] RT (s) | IEKS [11] Mem (MB) | Standard AMG RT (s) | Standard AMG Mem (MB) | Ours RT* (s) | Ours RT** (s) | Ours Mem (MB) |
|---|---|---|---|---|---|---|---|---|---|
| 49.6K | 78.34 | 111 | 6.25 | 68 | 3.953 | 46 | 3.593 | 2.972 | 40 |
| 199.2K | 391.7 | 424 | 29.76 | 308 | 15.875 | 182 | 14.235 | 12.719 | 156 |
| 448.8K | 1576 | 994 | 82.56 | 747 | 38.187 | 407 | 32.594 | 28.312 | 351 |
| 798.4K | 2903 | 1547 | 131.31 | 1230 | 68.219 | 721 | 59.328 | 51.812 | 624 |
| 1.248M | × | >3000 | × | × | 105.59 | 1130 | 93.75 | 83.156 | 974 |
| 1.7976M | × | >3000 | × | × | 152.36 | 1627 | 137.312 | 119.36 | 1401 |
| 2.4472M | × | >3000 | × | × | × | × | 196.531 | 167.6 | 1907 |

Table 4.3: Speedup of AbAMG compared to other methods

| Circuit Size | S_In | S_IEKS | S_AMG | S_No |
|---|---|---|---|---|
| 49.6K | 26.36 | 2.1 | 1.33 | 1.21 |
| 199.2K | 30.8 | 2.34 | 1.25 | 1.12 |
| 448.8K | 55.7 | 2.92 | 1.35 | 1.15 |
| 798.4K | 56 | 2.53 | 1.32 | 1.15 |
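The speedups in Table 4.3 follow directly from the run times in Table 4.2: each entry is the run time of the other method divided by RT**, the run time of AbAMG with compensation. As a quick check, the short sketch below recomputes them for the four circuits that every method could solve. The dictionary layout and the function name `speedups` are illustrative choices, not part of the thesis code.

```python
# Run times in seconds, copied from Table 4.2 (the four circuits all methods solved).
runtimes = {
    "49.6K":  {"InductWise": 78.34, "IEKS": 6.25,   "Standard AMG": 3.953,
               "AbAMG without compensation": 3.593,  "AbAMG with compensation": 2.972},
    "199.2K": {"InductWise": 391.7, "IEKS": 29.76,  "Standard AMG": 15.875,
               "AbAMG without compensation": 14.235, "AbAMG with compensation": 12.719},
    "448.8K": {"InductWise": 1576.0, "IEKS": 82.56, "Standard AMG": 38.187,
               "AbAMG without compensation": 32.594, "AbAMG with compensation": 28.312},
    "798.4K": {"InductWise": 2903.0, "IEKS": 131.31, "Standard AMG": 68.219,
               "AbAMG without compensation": 59.328, "AbAMG with compensation": 51.812},
}


def speedups(row, ours="AbAMG with compensation"):
    """Speedup of `ours` over every other method in one row of Table 4.2."""
    return {name: rt / row[ours] for name, rt in row.items() if name != ours}


table_4_3 = {size: speedups(row) for size, row in runtimes.items()}
for size, row in table_4_3.items():
    print(size, {name: round(value, 2) for name, value in row.items()})
```

The recomputed values agree with Table 4.3 to within the rounding used in the table.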

Fig. 4.1 plots CPU time versus circuit size for each method. The CPU time of the AMG-based methods grows approximately linearly with circuit size, and AbAMG with compensation has the best performance. Fig. 4.2 plots memory usage versus circuit size for each method. The memory usage of the AMG-based methods is also approximately proportional to circuit size, and AbAMG uses the least memory.

The proposed AbAMG solver computes the DC and transient nodal voltages of a circuit of size 2.4472M in 167.6 CPU seconds. This indicates that the proposed simulator is efficient for power delivery networks and can handle circuit sizes above two million.
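The near-linear scaling can be checked from Table 4.2 by dividing run time by circuit size. The sketch below does this for RT** (AbAMG with compensation); the variable names are illustrative, and "per unit of circuit size" is used because the thesis does not state what one unit of circuit size is.

```python
# (circuit size, RT** in seconds) for AbAMG with compensation, from Table 4.2.
samples = [
    (49.6e3, 2.972),
    (199.2e3, 12.719),
    (448.8e3, 28.312),
    (798.4e3, 51.812),
    (1.248e6, 83.156),
    (1.7976e6, 119.36),
    (2.4472e6, 167.6),
]

# Run time per unit of circuit size, in microseconds.
per_unit_us = [rt / size * 1e6 for size, rt in samples]

# Ratio between the largest and smallest per-unit cost; 1.0 would be exactly linear.
spread = max(per_unit_us) / min(per_unit_us)

for (size, _), cost in zip(samples, per_unit_us):
    print(f"{size:>10.0f}  {cost:6.2f} us per unit")
```

The per-unit cost stays between roughly 60 and 69 microseconds while the circuit size grows by a factor of about 49, so the growth is close to linear with a mild upward drift.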

Fig. 4.1: Run Time versus Circuit Size (x-axis: Circuit Size; y-axis: CPU Time (sec); curves: IEKS, Standard AMG, AbAMG without compensation, AbAMG with compensation)

Table 4.4: Comparison between AbAMG and standard AMG

| Circuit Size | Fine Grid NZ | Standard AMG Coarse Grid NZ | Standard AMG Cycle | AbAMG Coarse Grid NZ | AbAMG Cycle* | AbAMG Cycle** |
|---|---|---|---|---|---|---|
| 49.6K | 119.5K | 92K | 118 | 33.8K | 127 | 101 |
| 199.2K | 480K | 370K | 113 | 135.6K | 131 | 100 |
| 448.8K | 1082K | 834K | 111 | 305.4K | 131 | 101 |
| 798.4K | 1925K | 1484K | 114 | 543K | 135 | 103 |
| 1.248M | 3009K | 2320K | 109 | 849K | 131 | 102 |
| 1.7976M | 4335K | 3342K | 108 | 1223K | 132 | 100 |

Table 4.4 compares standard AMG and AbAMG. Fine Grid NZ and Coarse Grid NZ are the numbers of non-zero terms in the original fine grid and in the coarse grid, respectively. Cycle is the total number of multilevel cycles of standard AMG, and Cycle* and Cycle** are the total numbers of multilevel cycles of AbAMG without and with compensation. Non-zero terms versus circuit size and total multilevel cycles versus circuit size are plotted in Fig. 4.3 and Fig. 4.4. The coarse grid of AbAMG has only about one third as many non-zero terms as that of standard AMG (roughly 37% in Table 4.4). AbAMG with compensation needs fewer cycles than standard AMG, whereas AbAMG without compensation needs more.
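To make the "coarse grid NZ" quantity concrete, the sketch below builds a coarse operator by plain (unsmoothed) aggregation and counts non-zero terms before and after. This is the textbook piecewise-constant construction, not the compensated, globally informed mapping operators of AbAMG. The 1D Laplacian test matrix, the pairwise aggregates, and all function names are assumptions made for illustration.

```python
def laplacian_1d(n):
    """Dense tridiagonal(-1, 2, -1) matrix stored as a list of rows."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]


def coarse_operator(A, aggregate_of):
    """Galerkin coarse matrix P^T A P for a piecewise-constant P.

    aggregate_of[i] is the index of the aggregate that fine node i belongs to.
    With a 0/1 prolongation P, entry (I, J) of the coarse matrix is the sum of
    A[i][j] over all i in aggregate I and all j in aggregate J.
    """
    nc = max(aggregate_of) + 1
    Ac = [[0.0] * nc for _ in range(nc)]
    for i, row in enumerate(A):
        for j, a_ij in enumerate(row):
            Ac[aggregate_of[i]][aggregate_of[j]] += a_ij
    return Ac


def count_nonzeros(M):
    """Number of non-zero entries in a dense matrix."""
    return sum(1 for row in M for x in row if x != 0.0)


A = laplacian_1d(6)                          # fine grid: 6 nodes
Ac = coarse_operator(A, [0, 0, 1, 1, 2, 2])  # three aggregates of two nodes each
fine_nz, coarse_nz = count_nonzeros(A), count_nonzeros(Ac)
```

In this toy case pairing neighbouring nodes reduces the non-zero count from 16 to 7. How far the count drops on a real power grid depends on how the aggregates are chosen, which is where the aggregation strategy affects the Coarse Grid NZ columns of Table 4.4.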

Fig. 4.2: Memory Usage versus Circuit Size (x-axis: Circuit Size; y-axis: Memory Usage (MB); curves: InductWise, IEKS, Standard AMG, AbAMG)

Fig. 4.3: Non Zero Term versus Circuit Size (x-axis: Circuit Size; y-axis: Non-zero Terms (K); curves: Fine Grid, Coarse Grid of Standard AMG, Coarse Grid of AbAMG)

Fig. 4.4: Total Cycles versus Circuit Size (x-axis: Circuit Size; y-axis: Total Cycles; curves: Standard AMG, AbAMG without Compensation, AbAMG with Compensation)

Chapter 5 Conclusions

In this thesis, we present an aggregation-based algebraic multigrid solver for power/ground distribution network analysis. Unlike the traditional algebraic multigrid solver, our AbAMG solver constructs the inter-grid mapping operators from global information about the original system matrix. An aggregation algorithm performs an algebraic partition of the original system. With the matrix compensation algorithm and the global error estimation process, we obtain modified sub-matrices from the original system and compute approximate eigenvectors, from which the globally informed inter-grid mapping operators are constructed.

Experimental results show that the proposed methodology can analyze a circuit of size more than two million (2.4472M) in 167.6 CPU seconds. The maximum error for each RLKC test circuit is less than 1%. The significant speed improvement and the lower memory usage show that our AbAMG methodology is well suited to analyzing power delivery networks.

The global construction of the mapping operators improves the performance of AbAMG: it produces a smaller coarse grid and converges in fewer cycles than standard AMG.

Appendix A
