Nothing Special   »   [go: up one dir, main page]

skip to main content
article

Compiler optimization on VLIW instruction scheduling for low power

Published: 01 April 2003 Publication History

Abstract

In this article, we investigate compiler transformation techniques regarding the problem of scheduling VLIW instructions aimed at reducing power consumption of VLIW architectures in the instruction bus. The problem can be categorized into two types: horizontal scheduling and vertical scheduling. For the case of horizontal scheduling, we propose a bipartite-matching scheme for instruction scheduling. We prove that our greedy bipartite-matching scheme always gives the optimal switching activities of the instruction bus for given VLIW instruction scheduling policies. For the case of vertical scheduling, we prove that the problem is NP-hard, and we further propose a heuristic algorithm to solve the problem. Our experiment is performed on Alpha-based VLIW architectures and an ATOM simulator, and the compiler incorporated in our proposed schemes is implemented based on SUIF and MachSUIF. Experimental results of horizontal scheduling optimization show an average 13.30% reduction with four-way issue architecture and an average 20.15% reduction with eight-way issue architecture for transitional activities of the instruction bus as compared with conventional list scheduling for an extensive set of benchmarks. The additional reduction for transitional activities of the instruction bus from horizontal to vertical scheduling with window size four is around 4.57 to 10.42%, and the average is 7.66%. Similarly, the additional reduction with window size eight is from 6.99 to 15.25%, and the average is 10.55%.

References

[1]
Aburto, A., Sill, D., and Thompson, D. 1997. comp.benchmarks FAQ. Computer Sciences Department, University of Wisconsin, http://www.cs.wisc.edu/ thomas/comp.benchmarks.FAQ.html.
[2]
Alidina, M., Monteiro, J., Devadas, S., Ghosh, A., and Papaefthymiou, M. 1994. Precomputation-based sequential logic optimization for low power. In IEEE Trans. VLSI Systems 2, 4 (Dec.), 426--436.
[3]
Bellas, N., Hajj, I. N., Polychronopoulos, C. D., and Stamoulis, G. 2000. Architectural and compiler techniques for energy reduction in high-performance microprocessors. IEEE Trans. VLSI Syst. 8, 3 (June), 317--326.
[4]
Benini, L. and De Micheli, G. 1995. State assignment for low power dissipation. IEEE J. Solid-State Circ. 30, 3 (March), 258--268.
[5]
Bernstein, D. and Rodeh, M. 1991. Global instruction scheduling for superscalar machines. In Proceedings of Conference on Programming Language Design and Implementation. ACM Press, Toronto, Ontario, Canada, 11--22.
[6]
Chandrakasan, A. P., Sheng, S., and Brodersen, R. W. 1992. Low-power cmos digital design. IEEE Journal of Solid-State Circuits 27, 4 (Apr.), 473--484.
[7]
Chang, J.-M. and Pedram, M. 1995. Register allocation and binding for low power. In Proceedings of the Design Automation Conference. ACM Press, San Francisco, California, United States, 29--35.
[8]
Digital Equipment Corporation 1994. ATOM User Manual. Digital Equipment Corporation, Maynard, Massachusetts, United States.
[9]
Fisher, J. A. 1983. Very long instruction word architectures and the eli-512. In Proceedings of the International Symposium on Computer Architecture. IEEE Computer Society Press, Stockholm, Sweden, 140--150.
[10]
Fisher, J. A., Ellis, J., Ruttenberg, J., and Nicolau, A. 1984. Parallel processing: A smart compiler and a dumb machine. In Proceedings of the Symposium on Compiler Construction. ACM Press, Montreal, Canada, 37--47.
[11]
Fredman, M. L. and Tarjan, R. E. 1987. Fibonacci heap and their uses in improved network optimization algorithms. Journal of the Association for Computing Machinery 34, 3 (July), 596--615.
[12]
Hachtel, G. D., Hermida, M., Pardo, A., Poncino, M., and Somenzi, F. 1994. Re-encoding sequential circuits to reduce power dissipation. In Proceedings of International Conference on Computer-Aided Design. IEEE Computer Society Press, San Jose, California, United States, 70--73.
[13]
Hennessy, J. L. and Patterson, D. A. 1996. Computer Architecture - A Quantitative Approach, 2nd Edition. Morgan Kaufmann Publishers, Inc, 340 Pine Street, 6th Floor San Francisco, California 94104, United States.
[14]
Hong, I., Kirovski, D., Qu, G., Potkonjak, M., and Srivastava, M. B. 1999. Power optimization of variable-voltage core-based systems. IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems 18, 12 (Dec.), 1702--1714.
[15]
Irwin, M. J. 1999. Tutorial : Power reduction techniques in soc bus interconnects. In IEEE International ASIC/SOC Conference. IEEE Computer Society Press, Washington, D.C., United States.
[16]
Landskov, D., Davidson, S., and Shriver, B. 1980. Local microcode compaction techniques. ACM Comput. Surv. 12, 3 (Sept.), 261--294.
[17]
Lee, C., Lee, J. K., and Hwang, T. 2000. Compiler optimization on instruction scheduling for low power. In Proceedings of 13th International Symposium on System Synthesis. ACM Press, Madrid, Spain, 55--60.
[18]
Lee, M. T.-C., Tiwari, V., Malik, S., and Fujita, M. 1997. Power analysis and minimization techniques for embedded dsp software. IEEE Trans. VLSI Systems 5, 1 (Mar.), 123--133.
[19]
Leupers, R. and Marwedel, P. 1996. Algorithms for address assignment in dsp code generation. In Proceedings of International Conference on Computer-Aided Design. IEEE Computer Society Press, San Jose, California, United States, 109--113.
[20]
Papadimitriou, C. H. 1995. Computational Complexity. Addison-Wesley, 75 Arlington Street, Suite 300, Boston, Massachusetts 02116, United States.
[21]
Prasad, S. C. and Roy, K. 1993. Circuit activity driven multilevel logic optimization for low power reliable operation. In Proceedings of the European Design Automation Conference. IEEE Computer Society Press, Paris, France, 368--372.
[22]
Roy, K. and Prasad, S. C. 1992. Syslop: synthesis of cmos logic for low-power applications. In Proceedings of the International Conference on Computer Design. IEEE Computer Society Press, Cambridge, Massachusetts, United States, 464--467.
[23]
Smith, M. D. 1998. The SUIF Machine Library. Division of Engineering and Applied Sciences, Harvard University, http://www.eecs.harvard.edu/hube/software/v130/machine.html.
[24]
Stanford SUIF Compiler Group. 1994. http://suif.stanford.edu/suif/suif1/docs/suif_toc.html.
[25]
Su, C.-L., Tsui, C. Y., and Despain, A. M. 1994. Low power architecture design and compilation techniques for high-performance processors. In Proceedings of IEEE COMPCON. IEEE Computer Society Press, San Francisco, California, United States, 489--498.
[26]
Texas Instruments Incorporated 1997. TMS320C62xx CPU and Instruction Set Reference Guide. Texas Instruments Incorporated, 12500 TI Boulevard, Dallas, Texas 75243-4136, United States.
[27]
Tiwari, V., Malik, S., and Wolfe, A. 1994a. Compilation techniques for low energy: An overview. In Proceedings of the Symposium on Low Power Electronics. IEEE Computer Society Press, San Diego, CA, United States, 38--39.
[28]
Tiwari, V., Malik, S., and Wolfe, A. 1994b. Power analysis of embedded software: A first step towards software power minimization. IEEE Trans. VLSI Systems 2, 4 (Dec.), 437--445.
[29]
Tiwari, V., Malik, S., Wolfe, A., and Lee, T.-C. 1996. Instruction level power analysis and optimization of software. Journal of VLSI Signal Processing Systems 13, 2 (Aug.), 1--18.
[30]
Tsui, C.-Y., Pedram, M., and Despain, A. M. 1993. Technology decomposition and mapping targeting low power dissipation. In Proceedings of the Design Automation Conference. ACM Press, Dallas, Texas, United States, 68--73.
[31]
W. Ye, N. V., Kandemir, M., and Irwin, M. J. 2000. The design and use of simplepower: A cycleaccurate energy estimation tool. In Proceedings of the Design Automation Conference. ACM Press, Los Angeles, California, United States, 340--345.

Cited By

View all
  • (2024)Optimizing VLIW Instruction Scheduling via a Two-Dimensional Constrained Dynamic ProgrammingACM Transactions on Design Automation of Electronic Systems10.1145/364313529:5(1-20)Online publication date: 25-Jan-2024
  • (2024)Automatic Software Tailoring for Optimal PerformanceIEEE Transactions on Sustainable Computing10.1109/TSUSC.2023.33306719:3(464-481)Online publication date: May-2024
  • (2024)A Survey on Automatic Source Code Transformation for Green Software GenerationEncyclopedia of Sustainable Technologies10.1016/B978-0-323-90386-8.00122-4(765-779)Online publication date: 2024
  • Show More Cited By

Index Terms

  1. Compiler optimization on VLIW instruction scheduling for low power

      Recommendations

      Reviews

      Peter C. Patton

      The goal of this paper is to compare the generic programming aspects of C++ and the Java Development Kit (JDK), based on both qualitative and quantitative factors. The qualitative factors considered include each language's structures and application program interfaces (APIs), the advantages of homogeneity versus heterogeneity, and each language's ease of use. The quantitative factors considered include compiled size, runtime memory usage, and performance. In order to make these comparisons, a simple program, performing three basic operations, is considered. This program reads a file of integers, stores them in a container, sorts the container in ascending order, and, finally prints out the result. The program was initially written in a basic form (in C++ and Java), and then converted into a more generic vector application using Standard Template Library (STL) container and Java Development Kit (JDK) collection classes. Eventually, the example was converted from a vector to a linked list. The study concluded that C++ provides more advanced support for generic programming than Java, but also that many of the generic programming shortfalls in Java could be overcome by considering additional generic types, which are now being actively researched. The paper presents a simple case study as a basis for its comparative work, making this approach useful for students learning advanced programming features like genericity, and for software designers looking for more generic and reusable solutions to ordinary problems. Online Computing Reviews Service

      Access critical reviews of Computing literature here

      Become a reviewer for Computing Reviews.

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

      Information & Contributors

      Information

      Published In

      cover image ACM Transactions on Design Automation of Electronic Systems
      ACM Transactions on Design Automation of Electronic Systems  Volume 8, Issue 2
      April 2003
      128 pages
      ISSN:1084-4309
      EISSN:1557-7309
      DOI:10.1145/762488
      Issue’s Table of Contents

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Journal Family

      Publication History

      Published: 01 April 2003
      Published in TODAES Volume 8, Issue 2

      Permissions

      Request permissions for this article.

      Check for updates

      Author Tags

      1. Compilers
      2. VLIW instruction scheduling
      3. instruction bus optimizations
      4. low-power optimization

      Qualifiers

      • Article

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)8
      • Downloads (Last 6 weeks)0
      Reflects downloads up to 01 Oct 2024

      Other Metrics

      Citations

      Cited By

      View all
      • (2024)Optimizing VLIW Instruction Scheduling via a Two-Dimensional Constrained Dynamic ProgrammingACM Transactions on Design Automation of Electronic Systems10.1145/364313529:5(1-20)Online publication date: 25-Jan-2024
      • (2024)Automatic Software Tailoring for Optimal PerformanceIEEE Transactions on Sustainable Computing10.1109/TSUSC.2023.33306719:3(464-481)Online publication date: May-2024
      • (2024)A Survey on Automatic Source Code Transformation for Green Software GenerationEncyclopedia of Sustainable Technologies10.1016/B978-0-323-90386-8.00122-4(765-779)Online publication date: 2024
      • (2023)PEPA: Performance Enhancement of Embedded Processors through HW Accelerator Resource SharingProceedings of the Great Lakes Symposium on VLSI 202310.1145/3583781.3590277(23-28)Online publication date: 5-Jun-2023
      • (2022)GCD2: A Globally Optimizing Compiler for Mapping DNNs to Mobile DSPsProceedings of the 55th Annual IEEE/ACM International Symposium on Microarchitecture10.1109/MICRO56248.2022.00044(512-529)Online publication date: 1-Oct-2022
      • (2020)Evolutionary Algorithms for Instruction Scheduling, Operation Merging, and Register Allocation in VLIW CompilersJournal of Signal Processing Systems10.1007/s11265-019-01493-292:7(655-678)Online publication date: 1-Jul-2020
      • (2016)Studying the Impact of Bit Switching on CPU EnergyProceedings of the 19th International Workshop on Software and Compilers for Embedded Systems10.1145/2906363.2906382(173-179)Online publication date: 23-May-2016
      • (2016)A New DVFS Algorithm Design for Multi-core Processor ChipComputer Engineering and Technology10.1007/978-981-10-3159-5_5(40-51)Online publication date: 9-Dec-2016
      • (2015)The Design and Experiments of A SID-Based Power-Aware Simulator for Embedded Multicore SystemsACM Transactions on Design Automation of Electronic Systems10.1145/269983420:2(1-27)Online publication date: 2-Mar-2015
      • (2015)Compilers for Low Power with Design Patterns on Embedded Multicore SystemsJournal of Signal Processing Systems10.1007/s11265-014-0917-980:3(277-293)Online publication date: 1-Sep-2015
      • Show More Cited By

      View Options

      Get Access

      Login options

      Full Access

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Media

      Figures

      Other

      Tables

      Share

      Share

      Share this Publication link

      Share on social media