Data prefetching on the HP PA-8000

Published: 01 May 1997

Abstract

Memory latency is a major issue for many modern microprocessor-based systems, including the Hewlett-Packard PA-8000. Because of its fast clock rate and wide issue capability, cache misses on the PA-8000 are very expensive. The PA-8000 combines out-of-order execution with multiple outstanding memory requests to tolerate memory latency; however, this approach has its limitations. To reduce much of the memory latency penalty, the PA-8000 uses software-based data cache prefetching. In this paper, we discuss the implementation of the data prefetch generation algorithm in the Hewlett-Packard Precision Architecture (HP-PA) compiler. We present performance results for SPECfp95 on a PA-8000 system that show speedups, due to data prefetching, of up to 100%.
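As a rough illustration of what software-based data prefetching means at the source level, the sketch below issues a prefetch hint a fixed number of elements ahead of the element being summed. It is not the HP-PA compiler's prefetch generation algorithm, which the abstract does not describe. The function name `sum_with_prefetch` and the constant `PREFETCH_DISTANCE` are hypothetical, and `__builtin_prefetch` is the GCC/Clang builtin, used here as a stand-in for whatever prefetch instruction a compiler would emit.

```c
#include <stddef.h>

/* Hypothetical tuning knob: how many elements ahead of the current
 * access the prefetch hint is issued. A real compiler would derive
 * this from the memory latency and the cost of one loop iteration. */
#define PREFETCH_DISTANCE 16

/* Sum an array while hinting that a[i + PREFETCH_DISTANCE] will be
 * read soon. The hint does not change the result; it only asks the
 * hardware to start fetching the line early. */
double sum_with_prefetch(const double *a, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        /* Guard keeps the prefetched address inside the array. */
        if (i + PREFETCH_DISTANCE < n) {
            /* Arguments: address, 0 = read access, 3 = high temporal locality. */
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 3);
        }
        sum += a[i];
    }
    return sum;
}
```

Calling `sum_with_prefetch(a, n)` returns the same value as a plain summation loop. In practice a compiler would typically unroll the loop so that only one prefetch is issued per cache line rather than one per element.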



Published In

ISCA '97: Proceedings of the 24th annual international symposium on Computer architecture
June 1997
350 pages
ISBN:0897919017
DOI:10.1145/264107
  • ACM SIGARCH Computer Architecture News, Volume 25, Issue 2
    Special Issue: Proceedings of the 24th annual international symposium on Computer architecture (ISCA '97)
    May 1997
    349 pages
    ISSN:0163-5964
    DOI:10.1145/384286

Publisher

Association for Computing Machinery

New York, NY, United States
