
4.  Case Study: ARM11 MPCore Processor

4.2.  Experimental Results


      cg.W      176,394   -0.17%   1.15%
      ft.W       49,043    0.21%   2.44%
      is.W       24,308   -0.26%   0.96%
      FileRW     14,004    1.08%   1.67%
      Overall   291,048   -0.07%   1.79%
5     cg.W      167,235   -0.11%   1.11%
      ft.W       43,609   -0.10%   2.36%
      is.W       22,380   -0.26%   0.99%
      FileRW     13,467    0.73%   1.58%
      Overall   272,056   -0.10%   1.77%

4.2.2. DVS Experiment

The primary difference between SEProf and existing software energy estimation tools based on high-level modeling is that SEProf is aware of changes in the power level of an embedded processor at runtime. This section examines that feature. As in the VFS experiment, we selected five power levels for the ARM11 MPCore processor. In the DVS experiment, however, the clock frequency is the same at every power level, as shown in Table 5, because we could not scale the frequency of the processor without resetting it. This limitation does not prevent us from verifying that SEProf supports the feature, because scaling the voltage at runtime also changes the power consumption of the processor dynamically. As in the VFS experiment, only one MP11 CPU was active, and seven power tables were built, one for each of the six applications and one for the Linux kernel, as shown in Table 6.

Table 5. Power levels of the ARM11 MPCore processor used in DVS experiment

Power Level   Voltage (V)   Frequency (MHz)
1             0.95          140
2             1.01          140
3             1.08          140
4             1.14          140
5             1.2           140

Table 6. Pre-built power tables used in DVS experiment

Power Level   Average Power (uW)
              busybox   cg.W      ft.W      is.W      FileRW    oprofiled   vmlinux
1             259,910   248,793   257,860   245,596   245,596   263,693     236,264
2             297,920   283,671   294,975   281,724   281,911   301,850     271,293
3             338,951   321,917   335,648   321,153   329,959   343,958     308,797
4             381,720   362,457   378,441   363,074   370,296   387,785     349,494
5             426,690   402,961   422,059   405,925   407,946   432,234     389,445
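The role of the pre-built power tables can be illustrated with a minimal sketch. It assumes that energy is estimated by multiplying the table entry for the running binary and the current power level by the time spent in that state; the source does not spell out SEProf's internal accounting, so the function name, the trace format, and the example trace below are hypothetical. Only the table values are taken from Table 6.

```python
# Average power in uW, copied from Table 6 for two of the seven binaries.
POWER_TABLE_UW = {
    "is.W":    {1: 245_596, 2: 281_724, 3: 321_153, 4: 363_074, 5: 405_925},
    "vmlinux": {1: 236_264, 2: 271_293, 3: 308_797, 4: 349_494, 5: 389_445},
}


def estimate_energy_nj(segments):
    """Sum energy over (binary, power_level, duration_ms) segments.

    uW * ms = nJ, so the result is in nanojoules.
    This is an assumed accounting scheme, not SEProf's documented one.
    """
    return sum(
        POWER_TABLE_UW[binary][level] * duration_ms
        for binary, level, duration_ms in segments
    )


# Hypothetical trace: is.W runs 100 ms at level 1, the kernel runs 5 ms
# at level 2, then is.W runs 95 ms at level 2.
trace = [("is.W", 1, 100), ("vmlinux", 2, 5), ("is.W", 2, 95)]
energy_nj = estimate_energy_nj(trace)
print(f"estimated energy: {energy_nj / 1e6:.3f} mJ")
```

Keying the lookup on both the binary and the power level is what lets the estimate follow a voltage change at runtime: the same thread contributes a different power value as soon as the level changes.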

In the DVS experiment, the voltage of the processor was scaled periodically at three different time intervals: 100 ms, 1 s, and 10 s. At the end of each interval, the power level of the processor was increased by one; once the power level reached five, it was reset to one at the next interval. An example is depicted in Figure 6. The two lines in the figure show the measured and the estimated power consumption of the processor, sampled by the patched OProfile during the execution of the IS application. The DVS interval was set to 100 ms in this example, so the power consumption of the processor changed every 100 ms. Figure 6 shows that the estimated power consumption is very close to the measured one. Occasionally, however, the estimated line drops while the measured line does not. This happens because the thread executing the IS application was scheduled out during that period and another thread, which used a different power table, was scheduled in. If the newly scheduled thread has a lower average power consumption, a drop appears in the estimated line. The measured line, on the other hand, is based on the power value read from the ADC, which is updated only about every 5 ms, so the drop does not appear in Figure 6 if the newly scheduled thread is itself scheduled out within one ADC update period.

Figure 6. The measured and the estimated power consumption during the execution of is.W
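The cyclic voltage schedule used in the DVS experiment can be sketched as follows. The voltages come from Table 5; the function names and the choice of level 1 as the starting level are assumptions made for illustration.

```python
VOLTAGE_V = {1: 0.95, 2: 1.01, 3: 1.08, 4: 1.14, 5: 1.2}  # from Table 5


def next_power_level(level, max_level=5):
    """Step up by one level; wrap from max_level back to 1."""
    return 1 if level >= max_level else level + 1


def dvs_schedule(interval_ms, total_ms, start_level=1):
    """Return (start_time_ms, level) for each DVS interval in the run.

    start_level=1 is an assumption; the source does not state it.
    """
    schedule, level = [], start_level
    for t in range(0, total_ms, interval_ms):
        schedule.append((t, level))
        level = next_power_level(level)
    return schedule


steps = dvs_schedule(interval_ms=100, total_ms=700)
for start_ms, level in steps:
    print(f"{start_ms:4d} ms  level {level}  {VOLTAGE_V[level]:.2f} V")
```

With a 100 ms interval, one full pass through the five levels takes 500 ms, which matches the sawtooth pattern that a plot such as Figure 6 would be expected to show.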

The power estimation error in the DVS experiment is shown in Table 7. The average estimation error is still within 2%. However, the standard deviation of the estimation error grows as the DVS interval decreases. This is because the power consumption of the processor does not change immediately after a new value is written to the DAC, and the changed power consumption cannot be read from the ADC immediately either. Figure 7 illustrates this with seven power samples taken from the ADC while the voltage level of the processor was being scaled. The arrow in Figure 7 marks the time at which the new voltage level is written to the DAC. SEProf updates the power level of the processor at this point. The power consumption of the processor, however, does not change immediately; it becomes stable and readable from the ADC only within the next 10 ms. Consequently, the difference between the measured and the estimated power consumption during this period enlarges the standard deviation of the estimation error.

[Figure 6 plot area: power (uW) versus time (ms), 0 to 8,000 ms; legend entry: Estimated Power]

Table 7. Power estimation error in DVS experiment

DVS Interval   Application             Average Error   Std. Dev. of Error
100 ms         cg.W        227,888      1.33%          4.61%
               ft.W         71,085      1.04%          4.94%
               is.W         34,674      0.92%          4.87%
               FileRW       16,743      1.87%          5.10%
               Overall     394,869      1.25%          4.86%
1 s            cg.W        228,028      0.16%          1.94%
               ft.W         70,887      0.09%          2.28%
               is.W         34,688      0.06%          1.79%
               FileRW       17,027      0.82%          2.79%
               Overall     393,616      0.17%          2.21%
10 s           cg.W        228,118     -0.18%          1.35%
               ft.W         70,943     -0.08%          1.85%
               is.W         34,986     -0.26%          1.21%
               FileRW       16,767      0.88%          1.81%
               Overall     393,227     -0.13%          1.68%
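The trend in Table 7, a larger standard deviation at shorter DVS intervals, can be reproduced qualitatively with a toy model. The sketch below assumes that the estimate switches to the new power level instantly while the measured value follows as a step delayed by 10 ms, and that samples are taken every 5 ms. Both the step-shaped lag and the sampling scheme are simplifying assumptions, so the resulting numbers are not expected to match Table 7; only the ordering is meaningful. The power values are the is.W column of Table 6.

```python
from statistics import pstdev

IS_W_POWER_UW = [245_596, 281_724, 321_153, 363_074, 405_925]  # Table 6, is.W


def level_index_at(t_ms, interval_ms):
    """Index of the power level active at t_ms under the cyclic schedule."""
    return (t_ms // interval_ms) % len(IS_W_POWER_UW)


def error_std(interval_ms, lag_ms=10, sample_ms=5, cycles=2):
    """Std. dev. of relative error when the measurement lags by lag_ms."""
    total_ms = interval_ms * len(IS_W_POWER_UW) * cycles
    errors = []
    for t in range(lag_ms, total_ms, sample_ms):
        estimated = IS_W_POWER_UW[level_index_at(t, interval_ms)]
        # Assumed model: the measured power is the estimate delayed by lag_ms.
        measured = IS_W_POWER_UW[level_index_at(t - lag_ms, interval_ms)]
        errors.append((estimated - measured) / measured)
    return pstdev(errors)


stds = {ms: error_std(ms) for ms in (100, 1_000, 10_000)}
for ms, s in stds.items():
    print(f"DVS interval {ms:6d} ms: error std = {s:.2%}")
```

In this model the mismatch lasts for a fixed time after every transition, so the fraction of mismatched samples shrinks as the interval grows, and the standard deviation shrinks with it.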

Figure 7. Power samples during DVS

[Figure 7 plot area: power (uW) versus time (us), 0 to 30,000 us]

In the last experiment, we measured the performance overhead introduced by SEProf in the DVS experiment. As shown in Table 8, two variants of SEProf were implemented and evaluated. The first, called “SEProf - Time Only”, profiled only the time information of all threads at runtime and was used to examine the overhead caused by time profiling alone. The second, called “SEProf” in Table 8, profiled both the time and the energy of all threads at runtime. Table 8 lists the run times of the test programs under the Linux kernels patched with each of the two variants, normalized to the run time of the same programs under an unmodified Linux kernel. The overhead of using SEProf is less than 1%, which is small, even when the DVS interval is 100 ms.

Table 8. Performance overhead of using SEProf in DVS experiment

DVS Interval   Application Name / Overall   SEProf - Time Only   SEProf
100 ms         cg.W                          0.33%                0.22%
               ft.W                         -0.96%               -0.15%
               is.W                          0.15%                0.98%
               FileRW                        0.68%                0.54%
               Overall                       0.10%                0.26%
1 s            cg.W                          0.09%                0.29%
               ft.W                         -0.11%                0.04%
               is.W                         -0.07%               -0.04%
               FileRW                       -1.02%               -1.26%
               Overall                      -0.05%                0.08%
10 s           cg.W                         -0.85%               -1.01%
               ft.W                          0.92%               -0.14%
               is.W                         -2.34%               -0.32%
               FileRW                       -0.35%               -0.39%
               Overall                      -0.62%               -0.71%
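The percentages in Table 8 are consistent with run times normalized to the unmodified kernel and expressed as a relative difference. A minimal sketch of that arithmetic follows; the formula is an assumption about how the table was derived, and the timings in the example are hypothetical, not taken from the experiment.

```python
def overhead_percent(patched_s, baseline_s):
    """Relative slowdown of the patched kernel versus the unmodified one.

    Assumed formula: (patched / baseline - 1) * 100.
    A negative result means the patched run finished faster.
    """
    return (patched_s / baseline_s - 1.0) * 100.0


# Hypothetical run times in seconds.
slower = overhead_percent(patched_s=100.26, baseline_s=100.0)
faster = overhead_percent(patched_s=99.29, baseline_s=100.0)
print(f"{slower:+.2f}%  {faster:+.2f}%")
```

Under this reading, the negative entries in Table 8 correspond to runs that completed slightly faster under the patched kernel than under the unmodified one.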
