
Chapter 4 Result and Analysis

4.1 Profiling Result

For each simulation case, we first modify the application SW C++ program and build an ARM executable image file with the CoWare ARM926T Bootcode and scatter load.

Second, we modify the HW SystemC TLM models (if needed) and create a virtual platform using Platform Creator. Finally, we build and run the simulation with the ARM Symbolic Debugger (ASD), load the executable image file, enable the analysis function of the CoWare profiling utilities, and wait for the results [15].

In addition to the analysis reported by the CoWare profiling utilities, we gather some information ourselves, as introduced in Chapter 3. This information is printed directly on the ASD, however, which disturbs the performance analysis of the CoWare profiling utilities. We therefore build two executable images for each simulation case: one contains our profiling code, which prints the simulation information, and runs without the analysis function; the other omits our profiling code and runs with only the analysis function of the CoWare profiling utilities.

Our simulation flow for each case is as follows. First, we combine the SW golden functions with our modified functions to make sure the modification is functionally correct. Second, we remove the SW golden functions, add our own profiling code, and run the simulation again. Third, we remove our own profiling code and run the simulation with the analysis function of the CoWare profiling utilities enabled.
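The first step of this flow, checking a modified function against its golden version, can be illustrated with a small sketch. It is written in Python for brevity and is not the thesis code: `matches_golden`, `golden_mapper`, and `modified_mapper` are hypothetical names, and the toy bit-to-symbol mapper only stands in for a real TX function.

```python
from typing import Callable, Iterable, Sequence


def matches_golden(
    golden: Callable[[Sequence[int]], Sequence[complex]],
    modified: Callable[[Sequence[int]], Sequence[complex]],
    test_inputs: Iterable[Sequence[int]],
    tolerance: float = 0.0,
) -> bool:
    """Return True if `modified` reproduces `golden` on every test input."""
    for bits in test_inputs:
        expected = golden(bits)
        actual = modified(bits)
        if len(expected) != len(actual):
            return False
        if any(abs(e - a) > tolerance for e, a in zip(expected, actual)):
            return False
    return True


# Hypothetical stand-ins: a reference mapper and a table-driven rewrite of it.
def golden_mapper(bits):
    return [complex(1 - 2 * b, 0) for b in bits]


LOOKUP = (complex(1, 0), complex(-1, 0))


def modified_mapper(bits):
    return [LOOKUP[b] for b in bits]


vectors = [[0, 1, 1, 0], [1, 1, 1, 1], []]
print(matches_golden(golden_mapper, modified_mapper, vectors))
```

Only after such a check passes would the golden functions be removed and the profiling runs started.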

We simulate the three supported modulation modes separately for each simulation case and combine the results. The profiling results include function execution time, memory accesses, bus transactions, and so on.

Cache size is one of the factors we are interested in. We can adjust both the instruction and data cache sizes of the ARM926 PSP with the Parameter Editor in Platform Creator; the sizes are given in kilobytes. Since setting a cache size to zero is illegal, we must build another executable image that disables the cache module for the no-cache case. For each simulation case, we therefore have three results with different cache configurations: no cache, 4k bytes for both the instruction and data caches, and 32k bytes for both the instruction and data caches.
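The resulting set of runs can be enumerated with a short sketch. The dictionary keys (`cache_enabled`, `icache_kb`, `dcache_kb`) are hypothetical labels chosen for illustration, not Platform Creator parameter names; the no-cache entry records that the cache is disabled in the image rather than set to size zero.

```python
from itertools import product

CASES = (1, 2, 3, 4)
MODULATION_MODES = ("QPSK", "16QAM", "64QAM")

# Hypothetical description of the three cache configurations. A size of
# zero is illegal, so "0k" means the cache module is disabled in the image.
CACHE_CONFIGS = {
    "0k": {"cache_enabled": False, "icache_kb": None, "dcache_kb": None},
    "4k": {"cache_enabled": True, "icache_kb": 4, "dcache_kb": 4},
    "32k": {"cache_enabled": True, "icache_kb": 32, "dcache_kb": 32},
}


def simulation_runs():
    """List every (case, modulation mode, cache configuration) combination."""
    return [
        {"case": case, "mode": mode, "cache": label, **CACHE_CONFIGS[label]}
        for case, mode, label in product(CASES, MODULATION_MODES, CACHE_CONFIGS)
    ]


runs = simulation_runs()
print(len(runs))
```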

Table 4.1 shows the function profiling result for simulation Case 1. The Function name column lists five main functions. The function TX contains the other four: Modulation, STBC Encoder, and OFDM Modulator are called within TX, and IFFT is called within OFDM Modulator. The total execution time, in nanoseconds, is split into three columns corresponding to the three cache sizes, and the total instruction count of each function is listed as well.

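Table 4.1: Function profiling result in Case 1

| Modulation mode | Function name | Total execution time, 0k cache (ns) | Total execution time, 4k cache (ns) | Total execution time, 32k cache (ns) | Instruction counts |
| --- | --- | --- | --- | --- | --- |
| QPSK | TX | 91013600 | 44740300 | 44067700 | 3328995 |
| QPSK | Modulation | 242736 | 60016 | 58552 | 5376 |
| QPSK | STBC Encoder | 3474940 | 1084860 | 985424 | 59417 |
| QPSK | OFDM Modulator | 79216600 | 40976100 | 40527900 | 3099137 |
| QPSK | IFFT | 74916100 | 39559600 | 39269800 | 3013449 |
| 16QAM | TX | 91459200 | 44870500 | 44198600 | 3364008 |
| 16QAM | Modulation | 272984 | 85808 | 82688 | 8484 |
| 16QAM | STBC Encoder | 3474940 | 1084860 | 987000 | 59417 |
| 16QAM | OFDM Modulator | 79580700 | 41054300 | 40606800 | 3129506 |
| 16QAM | IFFT | 75280300 | 39637100 | 39348700 | 3043818 |
| 64QAM | TX | 91501400 | 44996400 | 44293800 | 3376156 |
| 64QAM | Modulation | 318400 | 110976 | 106120 | 6912 |
| 64QAM | STBC Encoder | 3474940 | 1084730 | 988480 | 59417 |
| 64QAM | OFDM Modulator | 79502700 | 41103800 | 40626300 | 3140154 |
| 64QAM | IFFT | 75202300 | 39687300 | 39368100 | 3054466 |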
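As a reading aid for Table 4.1, the following sketch derives two quantities from the QPSK rows: the speedup of TX relative to the cache-disabled run, and the share of TX time spent in IFFT. The helper names `speedup` and `share` are hypothetical; the numbers are copied from the table.

```python
# Execution times in ns for QPSK, Case 1 (Table 4.1): (0k, 4k, 32k caches).
TABLE_4_1_QPSK = {
    "TX": (91013600, 44740300, 44067700),
    "IFFT": (74916100, 39559600, 39269800),
}


def speedup(times_ns):
    """Speedup of each cache configuration relative to the first (0k) entry."""
    baseline = times_ns[0]
    return [baseline / t for t in times_ns]


def share(inner_ns, outer_ns):
    """Fraction of the outer function's time spent in the inner function."""
    return [inner / outer for inner, outer in zip(inner_ns, outer_ns)]


tx_speedup = speedup(TABLE_4_1_QPSK["TX"])
ifft_share = share(TABLE_4_1_QPSK["IFFT"], TABLE_4_1_QPSK["TX"])

for label, s, f in zip(("0k", "4k", "32k"), tx_speedup, ifft_share):
    print(f"{label}: TX speedup {s:.2f}x, IFFT share of TX {f:.1%}")
```

Under these numbers, enabling a 4k cache roughly halves the TX execution time, and IFFT accounts for more than 80% of TX in every configuration.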

Table 4.2: Memory accesses of functions in Case 1 (cache disabled)

| Modulation mode | Function name | ROM read | RAM read | RAM write |
| --- | --- | --- | --- | --- |
| QPSK | TX | 5926775 | 382603 | 343745 |
| QPSK | Modulation | 28056 | 3456 | 1920 |
| QPSK | STBC Encoder | 164303 | 18948 | 24579 |
| QPSK | OFDM Modulator | 5347218 | 332080 | 277648 |
| QPSK | IFFT | 5176510 | 315892 | 256836 |
| 16QAM | TX | 5928111 | 383371 | 343745 |
| 16QAM | Modulation | 30514 | 4224 | 1920 |
| 16QAM | STBC Encoder | 164303 | 18948 | 24579 |
| 16QAM | OFDM Modulator | 5346928 | 332080 | 277648 |
| 16QAM | IFFT | 5176220 | 315892 | 256836 |
| 64QAM | TX | 5949586 | 384139 | 343745 |
| 64QAM | Modulation | 43256 | 4992 | 1920 |
| 64QAM | STBC Encoder | 164303 | 18948 | 24579 |
| 64QAM | OFDM Modulator | 5352628 | 332080 | 277648 |

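Table 4.3: Memory accesses of functions in Case 1 (4k caches)

| Modulation mode | Function name | ROM read | RAM read | RAM write |
| --- | --- | --- | --- | --- |
| QPSK | TX | 17125 | 96644 | 343745 |
| QPSK | Modulation | 0 | 317 | 1579 |
| QPSK | STBC Encoder | 237 | 14733 | 24578 |
| QPSK | OFDM Modulator | 15884 | 53276 | 277644 |
| QPSK | IFFT | 15260 | 34516 | 256836 |
| 16QAM | TX | 18291 | 97170 | 343745 |
| 16QAM | Modulation | 0 | 0 | 0 |
| 16QAM | STBC Encoder | 237 | 14733 | 24578 |
| 16QAM | OFDM Modulator | 16900 | 52636 | 277644 |
| 16QAM | IFFT | 16252 | 33876 | 256836 |
| 64QAM | TX | 18100 | 96571 | 343745 |
| 64QAM | Modulation | 0 | 0 | 0 |
| 64QAM | STBC Encoder | 237 | 14733 | 24578 |
| 64QAM | OFDM Modulator | 16676 | 51484 | 277644 |
| 64QAM | IFFT | 16028 | 32724 | 256836 |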

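Table 4.4: Memory accesses of functions in Case 1 (32k caches)

| Modulation mode | Function name | ROM read | RAM read | RAM write |
| --- | --- | --- | --- | --- |
| QPSK | TX | 2303 | 28358 | 343745 |
| QPSK | Modulation | 0 | 0 | 0 |
| QPSK | STBC Encoder | 358 | 8812 | 33365 |
| QPSK | OFDM Modulator | 1844 | 12412 | 234647 |
| QPSK | IFFT | 117 | 2285 | 73270 |
| 16QAM | TX | 2338 | 29225 | 343745 |
| 16QAM | Modulation | 0 | 0 | 0 |
| 16QAM | STBC Encoder | 374 | 9589 | 33365 |
| 16QAM | OFDM Modulator | 0 | 0 | 0 |
| 16QAM | IFFT | 0 | 0 | 0 |
| 64QAM | TX | 2340 | 29915 | 343745 |
| 64QAM | Modulation | 0 | 0 | 0 |
| 64QAM | STBC Encoder | 149 | 4565 | 24578 |
| 64QAM | OFDM Modulator | 1876 | 12756 | 298846 |
| 64QAM | IFFT | 1876 | 12756 | 298846 |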

Table 4.2 lists the memory access counts of each function with the caches disabled in simulation Case 1; Table 4.3 and Table 4.4 show the same information for the same simulation case with 4k and 32k instruction and data caches, respectively. These tables are laid out like Table 4.1, but the listed values are ROM read, RAM read, and RAM write access counts. The access counts were gathered with additional profiling variables, which accumulate, for each function call, the difference between the memory model's total access counters at the beginning and at the end of the call.
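The counter-difference mechanism can be sketched as follows. This is a minimal Python illustration, not the instrumentation used in the thesis: `MemoryCounters` stands in for the counters of the memory model, and `FunctionProfile.measure` is a hypothetical helper that snapshots them at the beginning and end of a call.

```python
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class MemoryCounters:
    """Stand-in for the running totals kept by the memory model."""
    rom_read: int = 0
    ram_read: int = 0
    ram_write: int = 0

    def snapshot(self):
        return (self.rom_read, self.ram_read, self.ram_write)


class FunctionProfile:
    """Accumulates per-function [ROM read, RAM read, RAM write] differences."""

    def __init__(self):
        self.totals = {}

    @contextmanager
    def measure(self, name, counters):
        begin = counters.snapshot()  # counters at the begin position
        try:
            yield
        finally:
            end = counters.snapshot()  # counters at the end position
            acc = self.totals.setdefault(name, [0, 0, 0])
            for i in range(3):
                acc[i] += end[i] - begin[i]


counters = MemoryCounters()
profile = FunctionProfile()


def fake_ifft():
    # Pretend the function body caused these memory accesses.
    counters.rom_read += 10
    counters.ram_read += 2
    counters.ram_write += 1


with profile.measure("TX", counters):
    counters.rom_read += 5
    for _ in range(2):
        with profile.measure("IFFT", counters):
            fake_ifft()

print(profile.totals)
```

Because the outer measurement spans the inner calls, the counts of a calling function such as TX include those of the functions it calls.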

Table 4.5 shows the total memory read/write access counts in simulation Case 1 with different cache sizes. This information is gathered by running the other executable images, which do not contain the additional profiling variables and code used for the previous three tables.

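Table 4.5: Total memory accesses in Case 1

| Cache configuration | Access counts | Type | QPSK | 16QAM | 64QAM |
| --- | --- | --- | --- | --- | --- |
| Cache disabled | ROM | Read | 6607357 | 6897407 | 7179063 |
| Cache disabled | ROM | Write | 0 | 0 | 0 |
| Cache disabled | ROM | Total | 6607357 | 6897407 | 7179063 |
| Cache disabled | RAM | | | | |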

Table 4.6 is the bus transaction information in simulation Case 1 gathered by CoWare profiling utilities.

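Table 4.6: Bus transaction information in Case 1

| Modulation mode | Information type | Master | 0k cache | 4k cache | 32k cache |
| --- | --- | --- | --- | --- | --- |
| QPSK | Transaction Counts | IAHB | 6609320 | 36936 | 24912 |
| QPSK | Transaction Counts | DAHB | 912442 | 489451 | 421683 |
| QPSK | Transaction Throughputs (kB/s) | IAHB | 258975 | 2958.45 | 2024.27 |
| QPSK | Transaction Throughputs (kB/s) | DAHB | 35658.1 | 39191.8 | 34252.8 |
| QPSK | Bus Utilization (%) | IAHB | 53.038 | 0.605891 | 0.41457 |
| QPSK | Bus Utilization (%) | DAHB | 7.3221 | 8.028886 | 7.01739 |
| QPSK | Master Wait Total (%) | IAHB | 1.58846 | 0.0116303 | 0.00737214 |
| QPSK | Master Wait Total (%) | DAHB | 15.626 | 3.11136 | 3.08761 |
| QPSK | AVG. Waiting Masters | | 0.172145 | 0.0312299 | 0.0309498 |
| 16QAM | Transaction Counts | IAHB | 6891390 | 38109 | 25053 |
| 16QAM | Transaction Counts | DAHB | 946646 | 503391 | 434735 |
| 16QAM | Transaction Throughputs (kB/s) | IAHB | 258158 | 2900.02 | 1932.16 |
| 16QAM | Transaction Throughputs (kB/s) | DAHB | 35350.5 | 38296 | 33516.7 |
| 16QAM | Bus Utilization (%) | IAHB | 52.8708 | 0.593925 | 0.395706 |
| 16QAM | Bus Utilization (%) | DAHB | 7.26267 | 7.84529 | 6.86653 |
| 16QAM | Master Wait Total (%) | IAHB | 1.5213 | 0.0115796 | 0.0069181 |
| 16QAM | Master Wait Total (%) | DAHB | 15.4394 | 3.03037 | 3.01443 |
| 16QAM | AVG. Waiting Masters | | 0.169607 | 0.0304195 | 0.0302134 |
| 64QAM | Transaction Counts | IAHB | 7165550 | 37606 | 24982 |
| 64QAM | Transaction Counts | DAHB | 980599 | 517299 | 447843 |
| 64QAM | Transaction Throughputs (kB/s) | IAHB | 258087 | 2723.58 | 1834.01 |
| 64QAM | Transaction Throughputs (kB/s) | DAHB | 35190.8 | 37454.4 | 32866.9 |
| 64QAM | Bus Utilization (%) | IAHB | 52.8562 | 0.55779 | 0.375605 |
| 64QAM | Bus Utilization (%) | DAHB | 7.23332 | 7.67282 | 6.73333 |
| 64QAM | Master Wait Total (%) | IAHB | 1.51992 | 0.00998225 | 0.00661541 |
| 64QAM | Master Wait Total (%) | DAHB | 15.3297 | 2.9779 | 2.95016 |
| 64QAM | AVG. Waiting Masters | | 0.168497 | 0.0298788 | 0.0295677 |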

The following tables show the simulation results for other simulation cases.

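Table 4.7: Function profiling result in Case 2

| Modulation mode | Function name | Total execution time, 0k cache (ns) | Total execution time, 4k cache (ns) | Total execution time, 32k cache (ns) | Instruction counts |
| --- | --- | --- | --- | --- | --- |
| QPSK | TX | 15316600 | 5629390 | 5351670 | 305841 |
| QPSK | Modulation | 202832 | 166072 | 166072 | 3840 |
| QPSK | STBC Encoder | 1126790 | 592280 | 582760 | 20490 |
| QPSK | OFDM Modulator | 5908480 | 2239900 | 2066220 | 116444 |
| QPSK | IFFT | 1608190 | 828416 | 808200 | 30760 |
| 16QAM | TX | 15365800 | 5655700 | 5377680 | 307377 |
| 16QAM | Modulation | 202832 | 166024 | 166024 | 3840 |
| 16QAM | STBC Encoder | 1126790 | 592328 | 582792 | 20490 |
| 16QAM | OFDM Modulator | 5908480 | 2240100 | 2066730 | 116444 |
| 16QAM | IFFT | 1608190 | 828224 | 808208 | 30760 |
| 64QAM | TX | 15425800 | 5682720 | 5406250 | 310449 |
| 64QAM | Modulation | 202832 | 166024 | 166024 | 3840 |
| 64QAM | STBC Encoder | 1126790 | 592328 | 582688 | 20490 |
| 64QAM | OFDM Modulator | 5908480 | 2240090 | 2067290 | 116444 |
| 64QAM | IFFT | 1608190 | 828224 | 808232 | 30760 |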

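Table 4.8: Memory accesses of functions in Case 2 (cache disabled)

| Modulation mode | Function name | ROM read | RAM read | RAM write |
| --- | --- | --- | --- | --- |
| QPSK | TX | 739811 | 54439 | 72565 |
| QPSK | Modulation | 21889 | 1920 | 1920 |
| QPSK | STBC Encoder | 60930 | 3075 | 5123 |
| QPSK | OFDM Modulator | 271008 | 21324 | 25928 |
| QPSK | IFFT | 100364 | 5132 | 5132 |
| 16QAM | TX | 743392 | 55207 | 72565 |
| 16QAM | Modulation | 23041 | 2688 | 1920 |
| 16QAM | STBC Encoder | 60930 | 3075 | 5123 |
| 16QAM | OFDM Modulator | 271008 | 21324 | 25928 |
| 16QAM | IFFT | 100364 | 5132 | 5132 |
| 64QAM | TX | 747559 | 55975 | 72565 |
| 64QAM | Modulation | 28033 | 3456 | 1920 |
| 64QAM | STBC Encoder | 60930 | 3075 | 5123 |
| 64QAM | OFDM Modulator | 271008 | 21324 | 25928 |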

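Table 4.9: Memory accesses of functions in Case 2 (4k caches)

| Modulation mode | Function name | ROM read | RAM read | RAM write |
| --- | --- | --- | --- | --- |
| QPSK | TX | 1086 | 52524 | 72565 |
| QPSK | Modulation | 87 | 685 | 3883 |
| QPSK | STBC Encoder | 77 | 2029 | 5122 |
| QPSK | OFDM Modulator | 324 | 21916 | 25924 |
| QPSK | IFFT | 148 | 3412 | 5128 |
| 16QAM | TX | 966 | 53316 | 72565 |
| 16QAM | Modulation | 87 | 701 | 1939 |
| 16QAM | STBC Encoder | 77 | 2029 | 5122 |
| 16QAM | OFDM Modulator | 260 | 21916 | 25924 |
| 16QAM | IFFT | 84 | 3380 | 5128 |
| 64QAM | TX | 992 | 54118 | 72565 |
| 64QAM | Modulation | 121 | 2279 | 4363 |
| 64QAM | STBC Encoder | 77 | 2029 | 5122 |
| 64QAM | OFDM Modulator | 260 | 21916 | 25924 |
| 64QAM | IFFT | 84 | 3380 | 5128 |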

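Table 4.10: Memory accesses of functions in Case 2 (32k caches)

| Modulation mode | Function name | ROM read | RAM read | RAM write |
| --- | --- | --- | --- | --- |
| QPSK | TX | 662 | 24908 | 72565 |
| QPSK | Modulation | 0 | 0 | 0 |
| QPSK | STBC Encoder | 0 | 0 | 0 |
| QPSK | OFDM Modulator | 0 | 0 | 0 |
| QPSK | IFFT | 0 | 0 | 0 |
| 16QAM | TX | 651 | 25713 | 72565 |
| 16QAM | Modulation | 0 | 0 | 0 |
| 16QAM | STBC Encoder | 0 | 0 | 0 |
| 16QAM | OFDM Modulator | 545 | 14831 | 46525 |
| 16QAM | IFFT | 0 | 0 | 0 |
| 64QAM | TX | 662 | 26436 | 72565 |
| 64QAM | Modulation | 0 | 0 | 0 |
| 64QAM | STBC Encoder | 0 | 0 | 0 |
| 64QAM | OFDM Modulator | 0 | 0 | 0 |
| 64QAM | IFFT | 0 | 0 | 0 |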

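Table 4.11: Total memory and HW model accesses in Case 2

| Cache configuration | Access counts | Type | QPSK | 16QAM | 64QAM |
| --- | --- | --- | --- | --- | --- |
| Cache disabled | ROM | Read | 1416541 | 1691722 | 1968445 |
| Cache disabled | ROM | Write | 0 | 0 | 0 |
| Cache disabled | ROM | Total | 1416541 | 1691722 | 1968445 |
| Cache disabled | RAM | | | | |
| | Modulation HW | Read | 1536 | 1536 | 1536 |
| | Modulation HW | Write | 768 | 768 | 768 |
| | Modulation HW | Total | 2304 | 2304 | 2304 |
| | STBC Encoder HW | Read | 4096 | 4096 | 4096 |
| | STBC Encoder HW | Write | 1024 | 1024 | 1024 |
| | STBC Encoder HW | Total | 5120 | 5120 | 5120 |
| | IFFT HW | Read | 4096 | 4096 | 4096 |
| | IFFT HW | Write | 2052 | 2052 | 2052 |
| | IFFT HW | Total | 6148 | 6148 | 6148 |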

Table 4.11 differs slightly from Table 4.5 because Case 2 uses three individual HW accelerators, whose access counts are also listed.

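Table 4.12: Bus transaction information in Case 2

| Modulation mode | Information type | Master | 0k cache | 4k cache | 32k cache |
| --- | --- | --- | --- | --- | --- |
| QPSK | Transaction Counts | IAHB | 1498210 | 23802 | 23562 |
| QPSK | Transaction Counts | DAHB | 250097 | 192404 | 163484 |
| QPSK | Transaction Throughputs (kB/s) | IAHB | 243941 | 9638.86 | 9852.26 |
| QPSK | Transaction Throughputs (kB/s) | DAHB | 40584.1 | 77857.7 | 68299.3 |
| QPSK | Bus Utilization (%) | IAHB | 49.9592 | 1.97404 | 2.01774 |
| QPSK | Bus Utilization (%) | DAHB | 8.33972 | 15.9572 | 14 |
| QPSK | Master Wait Total (%) | IAHB | 1.3294 | 0.0352477 | 0.0329697 |
| QPSK | Master Wait Total (%) | DAHB | 17.447 | 5.73128 | 5.86466 |
| QPSK | AVG. Waiting Masters | | 0.187764 | 0.0576653 | 0.0589763 |
| 16QAM | Transaction Counts | IAHB | 1765700 | 23691 | 23315 |
| 16QAM | Transaction Counts | DAHB | 283917 | 205496 | 176568 |
| 16QAM | Transaction Throughputs (kB/s) | IAHB | 244796 | 7643.2 | 7710.54 |
| 16QAM | Transaction Throughputs (kB/s) | DAHB | 39165.5 | 66250.7 | 58345.5 |
| 16QAM | Bus Utilization (%) | IAHB | 50.1342 | 1.56533 | 1.57912 |
| 16QAM | Bus Utilization (%) | DAHB | 8.06135 | 13.5777 | 11.9589 |
| 16QAM | Master Wait Total (%) | IAHB | 1.19672 | 0.028213 | 0.0271596 |
| 16QAM | Master Wait Total (%) | DAHB | 16.6372 | 4.93952 | 5.0181 |
| 16QAM | AVG. Waiting Masters | | 0.178339 | 0.0496774 | 0.0504526 |
| 64QAM | Transaction Counts | IAHB | 2034920 | 24077 | 23677 |
| 64QAM | Transaction Counts | DAHB | 317763 | 218604 | 189700 |
| 64QAM | Transaction Throughputs (kB/s) | IAHB | 245490 | 6440.26 | 6467.13 |
| 64QAM | Transaction Throughputs (kB/s) | DAHB | 38093.6 | 58435 | 51775.3 |
| 64QAM | Bus Utilization (%) | IAHB | 50.2763 | 1.31897 | 1.32447 |
| 64QAM | Bus Utilization (%) | DAHB | 7.85089 | 11.9754 | 10.6116 |
| 64QAM | Master Wait Total (%) | IAHB | 1.10768 | 0.0232272 | 0.0221519 |
| 64QAM | Master Wait Total (%) | DAHB | 16.0548 | 4.39389 | 4.44984 |
| 64QAM | AVG. Waiting Masters | | 0.171625 | 0.0441711 | 0.0447199 |

Table 4.13: Function profiling result in Case 3

| Modulation mode | Function name | Total execution time, 0k cache (ns) | Total execution time, 4k cache (ns) | Total execution time, 32k cache (ns) | Instruction counts |
| --- | --- | --- | --- | --- | --- |
| QPSK | TX | 5176 | 3992 | 3992 | 123 |
| 16QAM | TX | 10168 | 8144 | 7856 | 243 |
| 64QAM | TX | 15160 | 12008 | 11720 | 363 |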

Table 4.14: Memory accesses of functions in Case 3

Modulation mode | Cache size | Function name | ROM read | RAM read | RAM write

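Table 4.15: Total memory and HW model accesses in Case 3

| Cache configuration | Access counts | Type | QPSK | 16QAM | 64QAM |
| --- | --- | --- | --- | --- | --- |
| Cache disabled | ROM | Read | 283810 | 559794 | 839382 |
| Cache disabled | ROM | Write | 0 | 0 | 0 |
| Cache disabled | ROM | Total | 283810 | 559794 | 839382 |
| Cache disabled | RAM | | | | |
| | TX HW | Read | 0 | 0 | 0 |
| | TX HW | Write | 24 | 48 | 72 |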

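Table 4.16: Bus transaction information in Case 3

| Modulation mode | Information type | Master | 0k cache | 4k cache | 32k cache |
| --- | --- | --- | --- | --- | --- |
| QPSK | Transaction Counts | IAHB | 300599 | 5771 | 5731 |
| QPSK | Transaction Counts | DAHB | 38973 | 15789 | 15755 |
| QPSK | Transaction Throughputs (kB/s) | IAHB | 248057 | 8296.62 | 8244.11 |
| QPSK | Transaction Throughputs (kB/s) | DAHB | 31485.1 | 22524.2 | 22491.1 |
| QPSK | Bus Utilization (%) | IAHB | 50.8023 | 1.69915 | 1.68839 |
| QPSK | Bus Utilization (%) | DAHB | 6.58656 | 4.64873 | 4.64154 |
| QPSK | Master Wait Total (%) | IAHB | 0.491968 | 0.0209044 | 0.0203279 |
| QPSK | Master Wait Total (%) | DAHB | 13.0944 | 2.50588 | 2.50387 |
| QPSK | AVG. Waiting Masters | | 0.135683 | 0.0252679 | 0.0252419 |
| 16QAM | Transaction Counts | IAHB | 568312 | 5908 | 5828 |
| 16QAM | Transaction Counts | DAHB | 72865 | 29201 | 28481 |
| 16QAM | Transaction Throughputs (kB/s) | IAHB | 249642 | 4547.9 | 4493.76 |
| 16QAM | Transaction Throughputs (kB/s) | DAHB | 31394.6 | 22385 | 21867 |
| 16QAM | Bus Utilization (%) | IAHB | 51.1267 | 0.93141 | 0.920323 |
| 16QAM | Bus Utilization (%) | DAHB | 6.5551 | 4.60361 | 4.49755 |
| 16QAM | Master Wait Total (%) | IAHB | 0.559385 | 0.0110357 | 0.0108961 |
| 16QAM | Master Wait Total (%) | DAHB | 12.6499 | 2.2445 | 2.23385 |
| 16QAM | AVG. Waiting Masters | | 0.132093 | 0.0225553 | 0.0224475 |
| 64QAM | Transaction Counts | IAHB | 840627 | 6105 | 6033 |
| 64QAM | Transaction Counts | DAHB | 106757 | 41333 | 41253 |
| 64QAM | Transaction Throughputs (kB/s) | IAHB | 249641 | 3109.74 | 3073.59 |
| 64QAM | Transaction Throughputs (kB/s) | DAHB | 31118.3 | 20992.1 | 20954.9 |
| 64QAM | Bus Utilization (%) | IAHB | 51.1265 | 0.636874 | 0.62947 |
| 64QAM | Bus Utilization (%) | DAHB | 6.4929 | 4.31186 | 4.30425 |
| 64QAM | Master Wait Total (%) | IAHB | 0.539773 | 0.00865847 | 0.00834703 |
| 64QAM | Master Wait Total (%) | DAHB | 12.6321 | 2.05208 | 2.05107 |
| 64QAM | AVG. Waiting Masters | | 0.131719 | 0.0206074 | 0.0205942 |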

We combined all TX function blocks into one HW accelerator for simulation Case 3, so only the TX function remains in the application SW program.

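Table 4.17: Function profiling result in Case 4

| Modulation mode | Function name | Total execution time, 0k cache (ns) | Total execution time, 4k cache (ns) | Total execution time, 32k cache (ns) | Instruction counts |
| --- | --- | --- | --- | --- | --- |
| QPSK | TX | 392512 | 175496 | 175400 | 8501 |
| QPSK | Modulation | 242152 | 71400 | 71360 | 5376 |
| 16QAM | TX | 379472 | 176224 | 176120 | 8889 |
| 16QAM | Modulation | 226616 | 70928 | 70888 | 5716 |
| 64QAM | TX | 477608 | 217360 | 216808 | 10157 |
| 64QAM | Modulation | 323712 | 112368 | 111952 | 6912 |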

Table 4.18: Memory accesses of functions in Case 4

Modulation mode | Cache size | Function name | ROM read | RAM read | RAM write

QPSK

0k TX 24560 1568 7

Modulation 24507 1562 3

4k TX 0 0 0

Modulation 22265 1586 3

4k TX 0 0 0

Modulation 34623 1610 3

4k TX 0 0 0

Modulation 0 0 0

32k TX 0 0 0

Modulation 0 0 0

Table 4.17 and Table 4.18 have one more function than Table 4.13 and Table 4.14, because simulation Case 4 retains the function Modulation in the application SW code.

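Table 4.19: Total memory and HW model accesses in Case 4

| Cache configuration | Access counts | Type | QPSK | 16QAM | 64QAM |
| --- | --- | --- | --- | --- | --- |
| Cache disabled | ROM | Read | 314888 | 589902 | 885465 |
| Cache disabled | ROM | Write | 0 | 0 | 0 |
| Cache disabled | ROM | Total | 314888 | 589902 | 885465 |
| Cache disabled | RAM | | | | |
| | TX HW | Read | 0 | 0 | 0 |
| | TX HW | Write | 1536 | 1536 | 1536 |
| | TX HW | Total | 1536 | 1536 | 1536 |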

The twenty tables in this section show the profiling results for the four simulation cases defined in Chapter 3, each run with the three supported modulation modes and three cache size configurations of the ISS. All results in these tables were simulated at a system clock frequency of 125 MHz, which means the system clock period is 8 ns. All TLM models of the HW accelerators are treated as ideal HW without delay, although we defined a delay parameter and a ready signal in the model template, and the bus transaction duration is also 8 ns.
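The timing figures can be cross-checked with a little arithmetic. The sketch below derives the clock period, converts an execution time from Table 4.1 into clock cycles, and recomputes an IAHB bus utilization from its throughput in Table 4.6. The function names are hypothetical, and two parameters are assumptions rather than facts from the text: that each instruction bus transaction transfers 4 bytes, and that 1 kB means 1024 bytes.

```python
CLOCK_HZ = 125_000_000  # system clock frequency stated in the text


def clock_period_ns(freq_hz=CLOCK_HZ):
    """Clock period in nanoseconds."""
    return 1e9 / freq_hz


def ns_to_cycles(time_ns, freq_hz=CLOCK_HZ):
    """Convert an execution time in ns to a whole number of clock cycles."""
    return round(time_ns / clock_period_ns(freq_hz))


def utilization_percent(throughput_kb_per_s,
                        bytes_per_transaction=4,   # assumption: 32-bit transfers
                        bytes_per_kb=1024,         # assumption: binary kilobytes
                        transaction_ns=8.0):       # bus transaction duration
    """Estimate bus utilization (%) from a throughput figure."""
    transactions_per_s = throughput_kb_per_s * bytes_per_kb / bytes_per_transaction
    return transactions_per_s * transaction_ns * 1e-9 * 100.0


print(clock_period_ns())                 # clock period in ns
print(ns_to_cycles(91013600))            # TX, QPSK, cache disabled (Table 4.1)
print(utilization_percent(258975))       # IAHB, QPSK, cache disabled (Table 4.6)
```

Under these assumptions the estimate agrees with the IAHB utilization reported for QPSK in Table 4.6 (53.038% with caches disabled); the DAHB rows are not reproduced exactly by the same formula, so the assumptions should not be read as a description of how the profiling utilities compute the figure.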

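Table 4.20: Bus transaction information in Case 4

| Modulation mode | Information type | Master | 0k cache | 4k cache | 32k cache |
| --- | --- | --- | --- | --- | --- |
| QPSK | Transaction Counts | IAHB | 330283 | 15784 | 15536 |
| QPSK | Transaction Counts | DAHB | 43585 | 18152 | 18136 |
| QPSK | Transaction Throughputs (kB/s) | IAHB | 246996 | 20644.5 | 20331.4 |
| QPSK | Transaction Throughputs (kB/s) | DAHB | 31969.4 | 23559.2 | 23551.3 |
| QPSK | Bus Utilization (%) | IAHB | 50.585 | 4.22799 | 4.16386 |
| QPSK | Bus Utilization (%) | DAHB | 6.6753 | 4.86229 | 4.8607 |
| QPSK | Master Wait Total (%) | IAHB | 0.609714 | 0.0195542 | 0.0179569 |
| QPSK | Master Wait Total (%) | DAHB | 13.6969 | 2.84366 | 2.84363 |
| QPSK | AVG. Waiting Masters | | 0.143066 | 0.0286321 | 0.0286158 |
| 16QAM | Transaction Counts | IAHB | 597722 | 15729 | 15656 |
| 16QAM | Transaction Counts | DAHB | 77453 | 31532 | 30860 |
| 16QAM | Transaction Throughputs (kB/s) | IAHB | 247778 | 11467.2 | 11430.9 |
| 16QAM | Transaction Throughputs (kB/s) | DAHB | 31522.1 | 22886.6 | 22429.9 |
| 16QAM | Bus Utilization (%) | IAHB | 50.7451 | 2.34848 | 2.3412 |
| 16QAM | Bus Utilization (%) | DAHB | 6.57555 | 4.70801 | 4.61451 |
| 16QAM | Master Wait Total (%) | IAHB | 0.600564 | 0.0116461 | 0.0113643 |
| 16QAM | Master Wait Total (%) | DAHB | 13.1682 | 2.40387 | 2.39458 |
| 16QAM | AVG. Waiting Masters | | 0.137687 | 0.0241552 | 0.0240595 |
| 64QAM | Transaction Counts | IAHB | 885043 | 15990 | 15926 |
| 64QAM | Transaction Counts | DAHB | 111323 | 43697 | 43633 |
| 64QAM | Transaction Throughputs (kB/s) | IAHB | 250197 | 7815.68 | 7786.46 |
| 64QAM | Transaction Throughputs (kB/s) | DAHB | 30908.6 | 21290.3 | 21264.6 |
| 64QAM | Bus Utilization (%) | IAHB | 51.2404 | 1.60065 | 1.59467 |
| 64QAM | Bus Utilization (%) | DAHB | 6.44514 | 4.37421 | 4.36896 |
| 64QAM | Master Wait Total (%) | IAHB | 0.551284 | 0.00830857 | 0.00821064 |
| 64QAM | Master Wait Total (%) | DAHB | 12.9417 | 2.17805 | 2.17792 |
| 64QAM | AVG. Waiting Masters | | 0.13493 | 0.0218636 | 0.0218613 |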
