Recent studies have attempted to reduce leakage power using integrated architecture and compiler power-gating mechanisms [Dropsho et al. 2002; Rele et al. 2002; You et al. 2006, 2002; Zhang et al. 2004, 2003]. Dropsho et al. [2002] proposed an analytical energy model for architecture-level analysis, and described the benefits of employing a dual-threshold-voltage technique to reduce sub-threshold leakage current in the integer functional units of a processor. They also proposed a simple architecture design, called gradual sleep, to reduce the overhead of activating the sleep mode for shorter idle periods. The work of Rele et al. [2002] is based on a profiling approach that identifies those blocks in which functional units are expected to be idle (based on the execution frequencies of each basic block), and then inserts off and on instructions at the entry and exit points of such blocks, respectively. You et al. [2002] proposed a more formal compiler methodology that uses data-flow analysis to collect information on the activity of each functional unit at each point of a program, and inserts power-gating instructions by using a scheduling algorithm to deal with the uncertainty of idle periods due to conditional branches. They also proposed an architecture to make power-gating controls applicable to out-of-order issue processors [You et al. 2006]. Aside from controlling the leakage energy of functional units, Zhang et al. [2004] presented a compiler-directed approach that inserts power-mode instructions for cache lines to control the leakage energy consumed in the instruction cache.
The previously described approaches have shown that leakage power can be effectively suppressed with help from compilers. However, there are concerns about the number of power-control instructions added to programs as increasing numbers of components are equipped with power-gating controls in SoC design platforms. While power-gating instructions can significantly reduce leakage power, they incur recovery penalties and increase the execution time and code size of programs. Our sink-n-hoist framework for a compiler solution attempts to merge several power-gating instructions into a single compound instruction so as to reduce the number of power-gating instructions.
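As a rough illustration of the merging idea, the following Python sketch combines power-gating instructions of the same kind that already sit next to each other into one compound instruction. It is not the sink-n-hoist analysis itself, which also moves power-gating instructions so that they become mergeable; the `PowerGate` and `Op` types, the component names, and the instruction text are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class PowerGate:
    """Hypothetical power-gating instruction covering one or more components."""
    action: str          # "off" or "on"
    units: frozenset     # names of the components being gated


@dataclass(frozen=True)
class Op:
    """Any ordinary (non-power-gating) instruction, kept as opaque text."""
    text: str


Instr = Union[PowerGate, Op]


def merge_adjacent(code: List[Instr]) -> List[Instr]:
    """Fold runs of neighbouring power-gating instructions with the same
    action into a single compound instruction. Instructions separated by
    an ordinary operation are left alone (no sinking or hoisting here)."""
    merged: List[Instr] = []
    for ins in code:
        prev = merged[-1] if merged else None
        if (isinstance(ins, PowerGate) and isinstance(prev, PowerGate)
                and prev.action == ins.action):
            # Same action, directly adjacent: widen the previous instruction.
            merged[-1] = PowerGate(prev.action, prev.units | ins.units)
        else:
            merged.append(ins)
    return merged


# Assumed toy program: two components gated off and on around one operation.
program = [
    PowerGate("off", frozenset({"fpu"})),
    PowerGate("off", frozenset({"mul"})),
    Op("add r1, r2, r3"),
    PowerGate("on", frozenset({"fpu"})),
    PowerGate("on", frozenset({"mul"})),
    Op("fmul f1, f2, f3"),
]
compact = merge_adjacent(program)
print(len(program), "->", len(compact))  # prints "6 -> 4"
```

In this toy program the four single-component instructions collapse into two compound ones, which is the kind of code-size saving the framework targets. A real implementation would additionally have to decide when moving an instruction toward its neighbours is legal and profitable.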
7. CONCLUSION
In summary, our experiments have demonstrated that the sink-n-hoist analysis framework proposed in this article improves code size and energy consumption with little impact on performance. It reduces the overall energy consumption and code size growth by an average of about 0.9% and 47.8%, respectively, compared with the CADFA scheme without our sink-n-hoist approach, and impacts performance by an average of less than 1%. Because compilation proceeds one phase after another, our framework provides a sound theoretical foundation capable of working with other improvements, such as adding more slackness for low power. We are currently in the process of incorporating more components (such as cryptography modules) into our architecture and simulator. We expect that our scheme will be even more beneficial as more extensible modules are equipped with power-gating controls in SoC design platforms.
ACKNOWLEDGMENTS
This work was supported in part by the National Science Council (under grant numbers NSC 95-2220-E-007-001 and NSC 95-2220-E-007-002), the Ministry of Economic Affairs (under grant numbers 95-EC-17-A-01-S1-034 and 96-EC-17-A-01-S1-034), and ITRI (under an ITRI/NTHU research grant). We are also grateful to the National Center for High-performance Computing for computer time and facilities.
REFERENCES
BELLAS, N., HAJJ, I. N., AND POLYCHRONOPOULOS, C. D. 2000. Architectural and compiler techniques for energy reduction in high-performance microprocessors. IEEE Trans. on Very Large Scale Integr. Syst. 8, 3 (Jun.), 317–326.
BROOKS, D., TIWARI, V., AND MARTONOSI, M. 2000. Wattch: A framework for architectural-level power analysis and optimizations. In Proceedings of the International Symposium on Computer Architecture (Vancouver, Canada), 83–94.
BUTTS, J. A. AND SOHI, G. S. 2000. A static power model for architects. In Proceedings of the Annual IEEE/ACM International Symposium on Microarchitecture (Monterey, CA), 191–201.
CHANDRAKASAN, A. P., SHENG, S., AND BRODERSEN, R. W. 1992. Low-Power CMOS digital design. IEEE J. Solid-State Circ. 27, 4, 473–484.
CHANG, J.-M. AND PEDRAM, M. 1995. Register allocation and binding for low power. In Proceedings of the Design Automation Conference (San Francisco, CA), 29–35.
COMPAQ COMPUTER CORP. 1999. Alpha 21264 Microprocessor Hardware Reference Manual.
DOYLE, B., ARGHAVANI, R., BARLAGE, D., DATTA, S., DOCZY, M., KAVALIEROS, J., MURTHY, A., AND CHAU, R. 2002. Transistor elements for 30 nm physical gate lengths and beyond. Intel Technol. J. 6, 2 (May), 42–54.
DROPSHO, S., KURSUN, V., ALBONESI, D. H., DWARKADAS, S., AND FRIEDMAN, E. G. 2002. Managing static leakage energy in microprocessor functional units. In Proceedings of the 35th International Symposium on Microarchitecture (MICRO) (Istanbul, Turkey), 321–332.
FEREMANS, C., LABBÉ, M., AND LAPORTE, G. 2003. Generalized network design problems. Eur. J. Oper. Res. 148, 1–13.
GONZALEZ, R. E. 2000. Xtensa: A configurable and extensible processor. IEEE Micro. 20, 2, 60–70.
HOROWITZ, M., INDERMAUR, T., AND GONZALEZ, R. 1994. Low-Power digital design. In Proceedings of the IEEE Symposium on Low Power Electronics (San Diego, CA), 8–11.
HU, Z., BUYUKTOSUNOGLU, A., SRINIVASAN, V., ZYUBAN, V., JACOBSON, H., AND BOSE, P. 2004. Microarchitectural techniques for power gating of execution units. In Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED) (Newport Beach, CA), 32–37.
IP, H., LOW, J., CHEUNG, P. Y. K., CONSTANTINIDES, G. A., LUK, W., SENG, S. P., AND METZGEN, P. 2002. Strassen’s matrix multiplication for customisable processors. In Proceedings of the IEEE International Conference on Field-Programmable Technology (FPT) (Hong Kong), 453–456.
JONES, R. 2004. Modeling and design techniques reduce 90 nm power. EE Times. http://www.eetimes.com/showArticle.jhtml?articleID=26806450.
KAO, J. T. AND CHANDRAKASAN, A. P. 2000. Dual-Threshold voltage techniques for low-power digital circuits. IEEE J. Solid-State Circ. 35, 7, 1009–1018.
KARNIK, T., BORKAR, S., AND DE, V. 2002. Sub-90nm technologies—Challenges and opportunities for CAD. In Proceedings of the International Conference on Computer-Aided Design (ICCAD) (San Jose, CA), 203–206.
KIM, N. S., AUSTIN, T., BLAAUW, D., MUDGE, T., FLAUTNER, K., HU, J. S., IRWIN, M. J., KANDEMIR, M., AND NARAYANAN, V. 2003. Leakage current: Moore’s law meets static power. IEEE Comput. 36, 12, 68–75.
KOSTER, A. M., VAN HOESEL, S. P., AND KOLEN, A. W. 1998. The partial constraint satisfaction problem: Facets and lifting theorems. Oper. Res. Lett. 23, 89–97.
LEE, C., LEE, J. K., HWANG, T.-T., AND TSAI, S.-C. 2003. Compiler optimizations on VLIW instruction scheduling for low power. ACM Trans. Des. Autom. Electron. Syst. 8, 2, 252–268.
LEE, M. T.-C., TIWARI, V., MALIK, S., AND FUJITA, M. 1997. Power analysis and minimization techniques for embedded DSP software. IEEE Trans. Very Large Scale Integr. Syst. 5, 1 (Mar.), 123–133.
RELE, S., PANDE, S., ONDER, S., AND GUPTA, R. 2002. Optimizing static power dissipation by functional units in superscalar processors. In Proceedings of the 11th International Conference on Compiler Construction (CC) (Grenoble, France), 261–275.
ROY, K. AND PRASAD, S. C. 1992. SYCLOP: Synthesis of CMOS logic for low power applications. In Proceedings of the IEEE International Conference on Computer Design (Cambridge, MA), 464–467.
SEMICONDUCTOR INDUSTRY ASSOC. 2004. International technology roadmap for semiconductors.
SMITH, M. D. 1998. The SUIF Machine Library. Division of Engineering and Applied Science, Harvard University.
STANFORD COMPILER GROUP. 1995. The SUIF Library. Stanford Compiler Group, Stanford University.
SU, C.-L. AND DESPAIN, A. M. 1995. Cache designs for energy efficiency. In Proceedings of the 28th Annual Hawaii International Conference on System Sciences (Los Angeles, CA), 306–315.
TIWARI, V., SINGH, D., RAJGOPAL, S., MEHTA, G., PATEL, R., AND BAEZ, F. 1998. Reducing power in high-performance microprocessors. In Proceedings of the Design Automation Conference (San Francisco, CA), 732–737.
TIWARI, V., DONNELLY, R., MALIK, S., AND GONZALEZ, R. 1997. Dynamic power management for microprocessors: A case study. In Proceedings of the International Conference on VLSI Design (Hyderabad, India), 185–192.
TSUTSUI, H., MASUZAKI, T., IZUMI, T., ONOYE, T., AND NAKAMURA, Y. 2002. High speed JPEG2000 encoder by configurable processor. In Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems (APCCAS) (Singapore), 45–50.
YANG, H., GOVINDARAJAN, R., GAO, G. R., CAI, G., AND HU, Z. 2002. Exploiting schedule slacks for rate-optimal power-minimum software pipelining. In Proceedings of the 3rd Workshop on Compilers and Operating Systems for Low Power (COLP) (Charlottesville, VA).
YOU, Y.-P., LEE, C., AND LEE, J. K. 2006. Compilers for leakage power reduction. ACM Trans. Des. Autom. of Electron. Syst. 11, 1 (Jan.), 147–164.
YOU, Y.-P., LEE, C., AND LEE, J. K. 2002. Compiler analysis and supports for leakage power reduction on microprocessors. In Proceedings of the International Workshop on Languages and Compilers for Parallel Computing (LCPC) (Washington, DC), 63–73. Lecture Notes in Computer Science, vol. 2481, Springer.
ZHANG, W., HU, J. S., DEGALAHAL, V., KANDEMIR, M., VIJAYKRISHNAN, N., AND IRWIN, M. J. 2004. Reducing instruction cache energy consumption using a compiler-based strategy. ACM Trans. Architect. Code Optimi. 1, 1 (Mar.), 3–33.
ZHANG, W., KANDEMIR, M. T., VIJAYKRISHNAN, N., IRWIN, M. J., AND DE, V. 2003. Compiler support for reducing leakage energy consumption. In Proceedings of the 6th Design Automation and Test in Europe Conference (DATE) (Messe Munich, Germany), 1146–1147.
ZIVOJNOVIC, V., MARTINEZ, J., SCHLAGER, C., AND MEYR, H. 1994. DSPstone: A DSP-Oriented benchmarking methodology. In Proceedings of the International Conference on Signal Processing and Technology (ICSPAT) (Dallas, TX), 715–720.
Received October 2006; revised May 2007; accepted May 2007