DOI: 10.1145/378993.379243

Hardware support for dynamic activation of compiler-directed computation reuse

Published: 12 November 2000

Abstract

Compiler-directed Computation Reuse (CCR) enhances program execution speed and efficiency by eliminating dynamic computation redundancy. In this approach, the compiler designates large program regions for potential reuse. During run time, the execution results of these reusable regions are recorded into hardware buffers for future reuse. Previous work shows that CCR can result in significant performance enhancements in general applications. A major limitation of the work is that the compiler relies on value profiling to identify reusable regions, making it difficult to deploy the scheme in many software production environments. This paper presents a new hardware model that alleviates the need for value profiling at compile time. The compiler is allowed to designate reusable regions that may prove to be inappropriate. The hardware mechanism monitors the dynamic behavior of compiler-designated regions and selectively activates the profitable ones at run time. Experimental results show that the proposed design makes more effective utilization of hardware buffer resources, achieves rapid employment of computation regions, and improves reuse accuracy, all of which promote more flexible compiler methods of identifying reusable computation regions.
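The run-time activation idea in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design from the paper: the class names `RegionState` and `ReuseBuffer`, the `warmup` and `min_repeat_rate` parameters, and the policy of counting consecutive repeated inputs during a monitoring phase are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Hashable, Optional, Tuple


@dataclass
class RegionState:
    """Bookkeeping for one compiler-designated region (hypothetical fields)."""
    table: Dict[Tuple, object] = field(default_factory=dict)  # inputs -> result
    last_inputs: Optional[Tuple] = None  # inputs seen on the previous execution
    attempts: int = 0                    # executions observed while monitoring
    repeats: int = 0                     # executions whose inputs matched the previous one
    active: bool = False                 # True once the region may use the buffer


class ReuseBuffer:
    """Toy software model of a reuse buffer with run-time activation.

    The thresholds are illustrative assumptions, not values from the paper.
    """

    def __init__(self, warmup: int = 4, min_repeat_rate: float = 0.5):
        self.warmup = warmup
        self.min_repeat_rate = min_repeat_rate
        self.regions: Dict[Hashable, RegionState] = {}

    def execute(self, region_id: Hashable, inputs: Tuple,
                compute: Callable[..., object]) -> Tuple[object, bool]:
        """Run a region and return (result, reused)."""
        state = self.regions.setdefault(region_id, RegionState())

        # Active region with a recorded result: skip the computation.
        if state.active and inputs in state.table:
            return state.table[inputs], True

        result = compute(*inputs)

        # Active region, new inputs: record the result for future reuse.
        if state.active:
            state.table[inputs] = result
            return result, False

        # Monitoring phase: count repeated inputs without using table space.
        state.attempts += 1
        if inputs == state.last_inputs:
            state.repeats += 1
        state.last_inputs = inputs

        # Activate only regions whose inputs repeat often enough.
        if (state.attempts >= self.warmup
                and state.repeats / state.attempts >= self.min_repeat_rate):
            state.active = True
            state.table[inputs] = result
        return result, False
```

With the default thresholds in this sketch, a region that is called repeatedly with the same inputs becomes active on its fourth execution and is served from the buffer from the fifth onward, while a region whose inputs change on every call is never activated and never occupies table space.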

Published In

ASPLOS IX: Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
November 2000
271 pages
ISBN: 1581133170
DOI: 10.1145/378993
  • ACM SIGARCH Computer Architecture News, Volume 28, Issue 5
    Special Issue: Proceedings of the ninth international conference on Architectural support for programming languages and operating systems (ASPLOS '00)
    Dec. 2000
    269 pages
    ISSN: 0163-5964
    DOI: 10.1145/378995
  • ACM SIGOPS Operating Systems Review, Volume 34, Issue 5
    Dec. 2000
    269 pages
    ISSN: 0163-5980
    DOI: 10.1145/384264
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

Conference

ASPLOS00: ASPLOS 2000 Conference
Cambridge, Massachusetts, USA

Acceptance Rates

ASPLOS IX Paper Acceptance Rate 24 of 114 submissions, 21%;
Overall Acceptance Rate 535 of 2,713 submissions, 20%

Article Metrics

  • Downloads (last 12 months): 124
  • Downloads (last 6 weeks): 23
Reflects downloads up to 10 Nov 2024

Cited By

  • (2018) GRU. Proceedings of the 2018 International Conference on Supercomputing, pages 43-52. DOI: 10.1145/3205289.3205318. Online publication date: 12-Jun-2018
  • (2014) Eliminating redundant fragment shader executions on a mobile GPU via hardware memoization. Proceeding of the 41st annual international symposium on Computer architecture, pages 529-540. DOI: 10.5555/2665671.2665748. Online publication date: 14-Jun-2014
  • (2014) Eliminating redundant fragment shader executions on a mobile GPU via hardware memoization. ACM SIGARCH Computer Architecture News 42:3, pages 529-540. DOI: 10.1145/2678373.2665748. Online publication date: 14-Jun-2014
  • (2014) Eliminating redundant fragment shader executions on a mobile GPU via hardware memoization. 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA), pages 529-540. DOI: 10.1109/ISCA.2014.6853207. Online publication date: Jun-2014
  • (2014) Hinting for Auto-Memoization Processor Based on Static Binary Analysis. Proceedings of the 2014 Second International Symposium on Computing and Networking, pages 426-432. DOI: 10.1109/CANDAR.2014.49. Online publication date: 10-Dec-2014
  • (2012) Dynamic Tolerance Region Computing for Multimedia. IEEE Transactions on Computers 61:5, pages 650-665. DOI: 10.1109/TC.2011.79. Online publication date: 1-May-2012
  • (2008) Computer Architecture Techniques for Power-Efficiency. Synthesis Lectures on Computer Architecture 3:1, pages 1-207. DOI: 10.2200/S00119ED1V01Y200805CAC004. Online publication date: Jan-2008
  • (2008) SoftSig. ACM SIGPLAN Notices 43:3, pages 145-156. DOI: 10.1145/1353536.1346300. Online publication date: 1-Mar-2008
  • (2008) SoftSig. ACM SIGOPS Operating Systems Review 42:2, pages 145-156. DOI: 10.1145/1353535.1346300. Online publication date: 1-Mar-2008
  • (2008) SoftSig. ACM SIGARCH Computer Architecture News 36:1, pages 145-156. DOI: 10.1145/1353534.1346300. Online publication date: 1-Mar-2008
