This project proposes and implements a high-level energy profiling tool called SEProf. SEProf estimates the energy consumption of each thread by maintaining a power table stack for each thread and tracking the power configurations of the embedded processors at runtime. This makes SEProf suitable for energy estimation on multi-core embedded systems with power management functions. We implemented SEProf in Linux kernel 2.6.19 and conducted a number of experiments on an ARM11 MPCore processor. VFS results show that the average power estimation error using SEProf was within 2% and the standard deviation of the estimation error was within 2%. DVS results indicate that the average power estimation error was within 4% and the standard deviation of the estimation error was within 5% when the DVS interval was 100 ms. The performance overhead introduced by SEProf in the DVS experiment was less than 1%.
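As a rough illustration of the accounting idea summarized above, the following Python sketch charges the running thread the power of the current processor configuration for the time it runs. It is a simplified sketch under stated assumptions, not SEProf's kernel implementation: the `EnergyProfiler` class, its method names, the two power tables, and all wattage and frequency values are hypothetical, and the per-thread stack is assumed to hold the power table that applies to whatever the thread is currently executing.

```python
from collections import defaultdict

# Hypothetical power tables: watts drawn at each power configuration (MHz).
USER_TABLE = {210: 0.30, 420: 0.50}
SYSCALL_TABLE = {210: 0.35, 420: 0.60}


class EnergyProfiler:
    """Bills each thread power(config) * elapsed time while it runs."""

    def __init__(self, initial_config):
        self.config = initial_config        # current power configuration
        self.stacks = defaultdict(list)     # thread id -> stack of power tables
        self.energy = defaultdict(float)    # thread id -> joules
        self.running = None                 # thread currently on the core
        self.last_time = 0.0

    def _charge(self, now):
        # Close the interval since the last event and bill the running thread
        # using the table on top of its stack.
        if self.running is not None and self.stacks[self.running]:
            table = self.stacks[self.running][-1]
            self.energy[self.running] += table[self.config] * (now - self.last_time)
        self.last_time = now

    def push_table(self, tid, table, now):
        self._charge(now)
        self.stacks[tid].append(table)

    def pop_table(self, tid, now):
        self._charge(now)
        self.stacks[tid].pop()

    def switch_to(self, tid, now):
        self._charge(now)
        self.running = tid

    def set_config(self, config, now):
        self._charge(now)
        self.config = config


# Usage: one thread, one frequency change, one nested table (times in seconds).
prof = EnergyProfiler(initial_config=420)
prof.push_table("t1", USER_TABLE, now=0.0)
prof.switch_to("t1", now=0.0)
prof.set_config(210, now=1.0)                  # 1 s at 420 MHz, user table
prof.push_table("t1", SYSCALL_TABLE, now=2.0)  # 1 s at 210 MHz, user table
prof.pop_table("t1", now=3.0)                  # 1 s at 210 MHz, syscall table
prof.switch_to(None, now=3.0)
print(f"t1 energy: {prof.energy['t1']:.2f} J")
```

Every event that changes what a thread should be billed (a context switch, a configuration change, or a table push or pop) first closes the previous interval, so each interval is charged at exactly one power value.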
References
[1] K. Choi, R. Soma, and M. Pedram, "Fine-Grained Dynamic Voltage and Frequency Scaling for Precise Energy and Performance Tradeoff Based on the Ratio of Off-Chip Access to On-Chip Computation Times," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 24, No. 1, January 2005.
[2] C. Isci, A. Buyuktosunoglu, C.-Y. Cher, P. Bose, and M. Martonosi, "An Analysis of Efficient Multi-Core Global Power Management Policies: Maximizing Performance for a Given Power Budget," the 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2006.
[3] J. Flinn and M. Satyanarayanan, "PowerScope: A Tool for Profiling the Energy Usage of Mobile Applications," in Proceedings of the Second IEEE Workshop on Mobile Computer Systems and Applications (WMCSA), 1999.
[4] D. Brooks, V. Tiwari, and M. Martonosi, "Wattch: A Framework for Architectural-Level Power Analysis and Optimizations," 27th International Symposium on Computer Architecture (ISCA-27), June 2000.
[5] M. Monchiero, R. Canal, and A. Gonzalez, "Power/Performance/Thermal Design-Space Exploration for Multicore Architectures," IEEE Transactions on Parallel and Distributed Systems, Vol. 19, No. 5, May 2008.
[6] V. Tiwari, S. Malik, and A. Wolfe, "Power analysis of embedded software: A first step towards software power minimization," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 2, Issue 4, pp. 437-445, Dec. 1994.
[7] A. Sinha and A. P. Chandrakasan, "Jouletrack - a web based tool for software energy profiling," in Proceedings of the Design Automation Conference (DAC), 2001.
[8] H. Blume, D. Becker, L. Rotenberg, M. Botteck, J. Brakensiek, and T.G. Noll, "Hybrid functional- and instruction-level power modeling for embedded and heterogeneous processor architectures," Journal of Systems Architecture, Vol. 53, Issue 10, pp. 689–702, 2007.
[9] H. Blume, J.v. Livonius, L. Rotenberg, T.G. Noll, H. Bothe, and J. Brakensiek, "Performance and Power Analysis of Parallelized Implementations on an MPCore Multiprocessor Platform," International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (IC-SAMOS), 2007.
[10] T. K. Tan, A. Raghunathan, and N. K. Jha, "EMSIM: An energy simulation framework for an embedded operating system," in Proceedings of IEEE International Symposium on Circuit & Systems, pages 464–467, May 2002.
[11] T. K. Tan, A. Raghunathan, and N. K. Jha, "A simulation framework for energy consumption analysis of OS-driven embedded applications," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 22(9) 1284-1294, Sept. 2003.
[12] T. K. Tan, A. Raghunathan, G. Lakshminarayana, and N. K. Jha. "High-level software energy macro-modeling," in Proceedings of Design Automation Conference, June 2001.
[13] G. Qu, N. Kawabe, K. Usami, and M. Potkonjak, "Function-level power estimation methodology for microprocessors," in Proceedings of Design Automation Conference (DAC), pp. 810–813, 2000.
[14] C.-H. Hsu, J.-J. Chen, and S.-L. Tsao, "Evaluation and Modeling of Power Consumption of a Heterogeneous Dual-Core Processor," in the 13th International Conference on Parallel and Distributed Systems (ICPADS), Hsinchu, Taiwan, Dec. 2007.
[15] "ARM11 MPCore Processor Revision r1p0 Technical Reference Manual," ARM, Feb. 2008.
[16] "Core Tile for ARM11 MPCore HBI-0146 User Guide," ARM, September 2006.
[17] "RealView™ Emulation Baseboard HBI-0140 Rev D User Guide," ARM, Oct. 2007.
[18] R. Kumar, K. I. Farkas, N. P. Jouppi, P. Ranganathan, and D. M. Tullsen, "Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction," in Proceedings of the 36th International Symposium on Microarchitecture (MICRO), Dec. 2003.
[19] J. Levon, "OProfile Internals," http://oprofile.sourceforge.net/doc/internals/index.html, 2003.
[20] H. Jin, M. Frumkin, and J. Yan, "The OpenMP Implementation of NAS Parallel Benchmarks and Its Performance," NAS Technical Report NAS-99-011, NASA Ames Research Center, Oct. 1999.
[21] D. Burger and T. M. Austin, "The SimpleScalar Tool Set," Version 2.0, Computer Architecture News, pp. 13-25, Jun. 1997.
[22] REALProf, http://of.openfoundry.org/projects/1399
[23] Jian-Jhen Chen, Shiao-Li Tsao, and Meng-Ru Lin, “A High-Level Software Energy Profiling Tool for Embedded Processors,” The Third Asia Pacific Embedded Systems Education and Research Conference, Singapore, Dec 2009.
[24] Jian-Jhen Chen, Shiao-Li Tsao, and Meng-Ru Lin, “SEProf: A High-Level Software Energy Profiling Tool for an Embedded System with Dynamic Power Management Functions,” in preparation, 2009.
[25] Jyun-Wei Lin and Shiao-Li Tsao, “Hardware-Assisted Performance/Energy Evaluation Tool for Multi-core Embedded System,” in preparation, 2009.
Project Self-Assessment
We have released the beta version of the REALProf tool on Open Foundry [22]. In addition, one conference paper based on the results of this project has been accepted [23], and two journal papers [24][25] are in preparation.
National Science Council Funded Research Project: Overseas (or Mainland China) Business Trip or
system project in Prof. Jürg Gutknecht's group. The project goals are to propose and develop system programming models and languages for data streaming applications based on reconfigurable embedded multi-core systems. Prof. Jürg Gutknecht and his group proposed a language called System Oberon to help designers develop embedded systems that include both hardware and software. Designers first describe their systems in the System Oberon language. The System Oberon compiler then generates efficient multi-core hardware for running the application, together with multi-threaded software that runs on top of the multi-core processor. The hardware descriptions and configurations generated by the System Oberon compiler are further processed by a hardware synthesis tool, e.g. Xilinx ISE, which generates the hardware on an FPGA board. Figure 1 shows the design flow of System Oberon and the FPGA-based embedded multi-core system. Electrocardiography (ECG) was used as an example application to verify the concept and design flow of System Oberon.
Figure 1. System Oberon and its design flow for FPGA-based embedded multi-core (Figure source: Lisa Liu and Oleksii Morozov, "A Process-Oriented Streaming System Design Paradigm for FPGA," submitted for publication).
Power consumption is an important issue for an embedded multi-core system. One benefit of System Oberon is that the compiler can produce efficient hardware for running the application program. For example, the interconnection bus and the shared memory buffers between CPU cores are optimized by System Oberon. However, the current System Oberon generates hardware without power management functions, and the CPU cores always operate at the maximal speed. In a multi-core system, CPU cores execute processes in parallel and sometimes communicate with each other to coordinate tasks and data. Unless the tasks on each CPU core, the communication between CPU cores, and the operating speeds and voltages of the CPU cores are managed precisely, the CPU cores may consume extra power during execution. Figure 2 (a) illustrates an example in which the system wastes power because all CPU cores operate at the same, maximal speed. Therefore, during this visit, I worked with Prof. Jürg Gutknecht and his group to improve the energy efficiency of systems generated from the System Oberon language. In the enhanced System Oberon, designers can specify power consumption requirements when they develop their systems, and the compiler generates the hardware needed to optimize the energy efficiency of the embedded multi-core system. For example, the designer provides energy requirement information in the System Oberon program. The compiler then generates hardware with power management functions, together with software that uses these hardware power management features to minimize the power consumption of the system. Figure 2 (b) gives an example.
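To make the cost of running every core at maximal speed concrete, the following Python sketch compares the dynamic energy of one task at two operating points, using the standard CMOS approximation E = C · V² · N for N clock cycles. The operating points, the effective switched capacitance, the cycle count, and the 2 ms period are hypothetical values chosen for illustration, not measurements from this project, and static and idle power are ignored.

```python
def dynamic_energy(cycles, voltage, capacitance=1e-9):
    """Dynamic switching energy in joules: E = C * V^2 * cycles."""
    return capacitance * voltage ** 2 * cycles


def run_time(cycles, freq_hz):
    """Seconds needed to execute the given number of cycles."""
    return cycles / freq_hz


# Hypothetical operating points: (clock frequency in Hz, supply voltage in V).
FAST = (400e6, 1.2)
SLOW = (200e6, 0.9)

cycles = 400_000    # work this core must finish in each period (assumed)
deadline = 2e-3     # period length in seconds (assumed)

for name, (freq, volt) in (("fast", FAST), ("slow", SLOW)):
    t = run_time(cycles, freq)
    e = dynamic_energy(cycles, volt)
    print(f"{name}: finishes in {t * 1e3:.1f} ms, uses {e * 1e6:.0f} uJ")

saving = 1 - dynamic_energy(cycles, SLOW[1]) / dynamic_energy(cycles, FAST[1])
print(f"dynamic energy saved at the slow point: {saving:.1%}")
```

Under these assumptions the fast point finishes early and then waits, while the slow point still meets the deadline and needs less dynamic energy, because the cycle count is unchanged and the energy per cycle scales with the square of the supply voltage.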
Figure 2. Reducing power consumption of an embedded multi-core system through hardware and software power management functions. Panels (a) and (b) plot tasks and power consumption over time (2 ms marked) for CPU1 to CPU5; one label reads "Different voltages/clock rates/power modes/power domains/types of processors".
Figure 3. Low-power hardware extension based on System Oberon.
Figure 3 and Figure 4 illustrate our proposed low-power enhancements to System Oberon. Figure 3 shows that a designer can specify the performance or energy requirements of the system in the System Oberon language. Given this information, the System Oberon compiler adds the necessary low-power hardware and/or additional power management hardware to the hardware platform. For example, in Figure 3, the designer added the low-power attribute to the system description, and the compiler automatically integrated the power management hardware components into the original hardware design. Designers can then use power management functions when they develop energy-aware applications based on System Oberon. Figure 4 gives an example in which the designer uses power management functions such as sleep and idle in a program. Moreover, when the designer specifies the low-power attribute in the System Oberon program, busy-waiting functions are automatically translated into sleep-wakeup versions, which consume much less power.
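The difference between a busy-waiting receive and its sleep-wakeup counterpart can be sketched in Python as follows. This is only an analogy under stated assumptions, not the code that the System Oberon compiler generates: the function names are hypothetical, a `queue.Queue` stands in for the hardware FIFO, and the blocking `get` call stands in for putting the core to sleep until the FIFO signals that data has arrived.

```python
import queue
import threading
import time


def busy_wait_receive(channel, timeout_s):
    """Poll the channel in a tight loop until data arrives or the timer expires."""
    deadline = time.monotonic() + timeout_s
    polls = 0
    while time.monotonic() < deadline:
        polls += 1
        try:
            return channel.get_nowait(), polls
        except queue.Empty:
            pass  # keep spinning: the core stays fully active while waiting
    return None, polls


def sleeping_receive(channel, timeout_s):
    """Block until a sender wakes the receiver or the timeout elapses."""
    try:
        return channel.get(timeout=timeout_s), 1
    except queue.Empty:
        return None, 1


# Usage: a sender delivers one sample 50 ms after each receive starts.
ch = queue.Queue()
threading.Timer(0.05, ch.put, args=("sample",)).start()
value, polls = busy_wait_receive(ch, timeout_s=1.0)

threading.Timer(0.05, ch.put, args=("sample",)).start()
value2, wakeups = sleeping_receive(ch, timeout_s=1.0)

print(f"busy-wait: got {value!r} after {polls} polls")
print(f"sleep-wakeup: got {value2!r} after {wakeups} wakeup")
```

Both versions return the same data and honor the same timeout. The busy-waiting version spends the whole wait executing instructions, while the blocking version does no work until it is woken, which is the behavior the low-power attribute is meant to select.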
[Figure labels: low-power attributes and performance/power requirements; power manager; clk; gclk; clken; FIFO; throughout bus/register; FIFO/IO status; BUFGCE]
Figure 4. Low-power software extension based on System Oberon.
During the three-month visit, we not only enhanced System Oberon to support energy-aware program development, but also prototyped the proposed concepts in the System Oberon language, the System Oberon compiler, and the CPU hardware called TRM (Tiny Register Machine). We used the ECG application to verify our designs and to compare energy consumption before and after applying our proposed energy-aware features. To evaluate power consumption, we established the power consumption evaluation platform shown in Figure 5 and used it to measure the power consumption of the ECG multi-core embedded system. Figure 6 shows the experimental results. Compared with a fully connected multi-core embedded system, the system generated by System Oberon consumes significantly less power because unnecessary interconnection buses and buffers are avoided. The ECG multi-core embedded system generated by our energy-aware System Oberon reduces power consumption by a further 45%.
BEGIN
  REPEAT UNTIL checkreceive(in, a) OR timerexpire()
  IF timerexpire() THEN
Figure 5. Architecture and demonstration of the power consumption evaluation platform. [Figure labels: power supply; ML505 board; Agilent U1253B; DC power connector; AC power; PC; IR-USB]
Figure 6. Performance evaluation of the proposed low-power improvement. [Figure labels: low-power version; ECG: process 500 samples per second; 0.54502; XPower simulation; estimated results based on physical measurement; 45%]
A system-level design environment is very important to embedded system development. System Oberon provides a convenient language for designers to describe their system architecture, hardware configurations, and software functions and procedures, and the System Oberon compiler can produce efficient hardware and software for the application. Our low-power and power management enhancements to System Oberon add language-level support for designers to specify their energy and performance requirements. The development of energy-aware systems and software therefore becomes possible within the System Oberon design flow. Our prototype and preliminary experimental results demonstrate that the proposed enhancement can reduce the power consumption of the ECG application by 45%.
Colleagues in Prof. Jürg Gutknecht's group are further improving the system and collecting more experimental results. We are working together on a joint research paper, which will be submitted to a conference or journal.
III. Suggestions and Follow-up Cooperation
Figure 7. Prof. Jürg Gutknecht, his group and me.
This was a very successful visit. My research on energy-aware computing is a system-wide topic that requires knowledge and support across a wide spectrum of computer systems. Because of the limited research resources I had in Taiwan, I usually had to focus on a specific point in a system and could not evaluate the design from a system perspective. I also usually had to design and evaluate the proposed ideas and technologies on separate hardware and software platforms. Prof. Jürg Gutknecht and his group, on the other hand, have built entire embedded multi-core systems by themselves, including the hardware, system software, and application software. We were therefore able to study our low-power designs from a system point of view, and to realize and evaluate the hardware and software designs in a real system. I learned a great deal from cooperating with Prof. Jürg Gutknecht and his group members.
Prof. Jürg Gutknecht and his group have worked on programming languages, compilers, run-time systems, operating systems, and hardware designs for years, with outstanding achievements. They wanted to consider power consumption issues in their research and therefore invited me to visit. Power consumption is regarded as a critical problem for information and communication technology (ICT) infrastructure in the next decade. Reducing the electrical needs of computer systems has become one of the most important tasks for computer scientists and engineers, and has recently attracted considerable interest in both academia and industry. During this visit, I brought new ideas, gave talks, and shared with the group my experience and research results in reducing the power consumption of computer systems. We not only jointly proposed new low-power ideas for the existing projects, but also prototyped and evaluated the proposed systems in a real environment. The experimental results demonstrate that we could significantly reduce the power consumption of the systems. Moreover, I helped the group establish the power consumption evaluation platform so that they can continue low-power research and development after my return to Taiwan.
I was invited to join the annual retreat of Prof. Jürg Gutknecht's group on Sept. 24 and 25, 2010. During the meetings, we all agreed that the cooperation had been very successful and that the visit had produced fruitful research results of mutual benefit to both groups. We also concluded that we should continue and further extend the cooperation between the two groups. The following research cooperation has been arranged and is currently ongoing:
(1) FP7 project cooperation: Prof. Jürg Gutknecht is involved in an FP7 project called Online Predictive Tools for Intervention in Mental Illness (OPTIMI). His group is building a wearable device for monitoring patients' physiological data. The wearable device is currently powered by batteries but will later be powered by harvested energy sources such as solar, thermal, and kinetic energy. Power consumption is one of the most challenging issues in the device design. My team has worked on power management middleware and run-time support for energy-aware software, which is a critical software component for the FP7 OPTIMI project. Therefore, Prof. Jürg Gutknecht and the FP7 OPTIMI project coordinator invited me to join the project so that we could contribute our power management middleware to it (see the letters of intention below).
Figure 8. Letters of intention for joining FP7 OPTIMI project.
Based on the discussion, I then proposed a project under the joint research projects agreement for scientific cooperation between Switzerland (SNSF) and Taiwan (NSC). The project has just been approved and funded by the National Science Council in Taiwan. Through this joint project, we will continue our cooperation and extend it to the area of power management frameworks for wearable devices and energy-scavenging sensors.
(2) New FP7 project on green datacenters: Prof. Jürg Gutknecht invited me to visit Microsoft Research Cambridge in the UK on Sept. 28, 2010. During the visit, we had an intensive discussion on power management issues in datacenters and found very good synergy between Prof. Jürg Gutknecht's group at ETH Zurich, Microsoft Research Cambridge, and my group at National Chiao Tung University in Taiwan. We thus decided to work on a new proposal under FP7. The draft project title is SPREAD: Scalable Predictably Robust Energy Aware Datacenter. The overall objective is to define, develop, evaluate, and model a scalable methodology for the ongoing deployment of energy-aware datacenters. We are currently preparing the proposal and plan to submit it next year.
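National Science Council Funded Project: Data Sheet for Promotion of Derived R&D Results
Date: 2011/01/03
National Science Council Funded Project
Project title: Subproject 3: Analysis, Monitoring, and Improvement of Performance and Energy Consumption for Multi-core Embedded Systems (2/2) Principal investigator: Shiao-Li Tsao (曹孝櫟)
Project number: 98-2220-E-009-013- Field: Chip Technology Program, Integrated Academic Research Project
No R&D results promotion data.
Summary Table of Research Results for Year 98 Research Projects
Conference papers 1 1 50%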
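National Science Council Funded Research Project: Results Report Self-Assessment Form
Please give an overall assessment of how closely the research content matches the original plan, how well the expected goals were achieved, the academic or application value of the research results (briefly describe the significance, value, and impact of the results, or the possibility of further development), whether the work is suitable for publication in an academic journal or for a patent application, and the main findings or other relevant value.
1. Please give an overall assessment of how closely the research content matches the original plan and how well the expected goals were achieved.
■ Goals achieved
□ Goals not achieved (please explain, within 100 characters)
□ Experiment failed
□ Experiment interrupted for some reason
□ Other reasons. Explanation:
2. Publication of the research results in academic journals, patent applications, etc.:
Papers: ■ Published □ Unpublished manuscript □ In preparation □ None Patents: □ Granted □ Pending ■ None
Technology transfer: □ Transferred □ Under negotiation ■ None Other: (within 100 characters)
3. Please evaluate the academic or application value of the research results in terms of academic achievement, technical innovation, social impact, and so on (briefly describe the significance, value, and impact of the results, or the possibility of further development) (within 500 characters)
In this project, we proposed and implemented a series of power consumption evaluation tools for embedded multi-core systems. These include SEProf, a software power consumption evaluation tool designed around a power model, and REALProf, a hardware-assisted performance and energy evaluation tool suited to multi-core embedded systems. The proposed tool monitors hardware events while a program executes and uses them to estimate component energy consumption. This avoids the extra overhead caused by software sampling, so the original behavior and characteristics of the system can be observed. Experimental results show that the proposed method can run in a 100 MHz quad-core (LEON 3) emulation environment with sub-microsecond precision. Because it is fast, fine-grained, and realistic, it will help with detailed evaluation and analysis during the design of complex multi-core embedded systems. The results of this project have been released through Open Foundry for use by academia and industry, and several related papers have been published or submitted.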