
DOI: 10.5555/517554.825771
Article

On Some Implementation Issues for Value Prediction on Wide-Issue ILP Processors

Published: 15 October 2000

Abstract

In this paper, we look at two issues that could affect the performance of value prediction on wide-issue ILP processors. One is the large number of accesses to the value prediction tables needed in each machine cycle; the other is the latency required to update stale values in the value prediction tables. We introduce a prediction value cache (PVC), which augments the instruction cache to hold the prediction values. The PVC not only provides the bandwidth required to access multiple prediction values in each machine cycle, but also allows us to decouple value prediction from the critical path in the instruction fetch stage. We use a hybrid value predictor with dynamic classification to perform value prediction in the write-back stage, and assume a realistic number of read/write ports (e.g., 2 read/write ports) with queues in the prediction tables. In simulations, we found good performance for an 8-issue processor.

We also found that, in an 8-issue processor running the SPECint95 benchmark programs, 36% of instructions access the same value prediction table entry again within 5 cycles, and 22% do so within 2 cycles. Unless the prediction tables can be updated quickly, especially for the Stride and Two-level types, those value predictions will use stale values and mostly result in mispredictions. We examine several schemes, such as attaching an age counter and using speculative update, to cope with the problem of delayed updates, but found them not very effective because of the latency required in dynamic classification. If this latency can be reduced, e.g., by using compiler analysis to determine access types at compile time, performance could be further improved.
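The stale-value problem described above can be illustrated with a toy model. The sketch below is not the paper's simulator or predictor design. It assumes a single stride-predictor table entry, a hypothetical `access_interval` (cycles between successive accesses to that entry), and a hypothetical `update_delay` (cycles before a result reaches the table). When the entry is re-accessed before earlier updates arrive, the prediction is built from an old value and misses.

```python
from collections import deque

def simulate_stride(values, access_interval, update_delay):
    """Count correct predictions for one table entry of a toy stride predictor.

    values          -- actual results of successive dynamic instances
    access_interval -- cycles between accesses to the entry (assumed parameter)
    update_delay    -- cycles until a result is written to the table (assumed)
    """
    last, stride = 0, 0          # table entry: last value seen and its stride
    pending = deque()            # in-flight updates as (ready_cycle, value)
    correct = 0
    for i, actual in enumerate(values):
        cycle = i * access_interval
        # Apply only the updates that have reached the table by this cycle.
        while pending and pending[0][0] <= cycle:
            _, v = pending.popleft()
            stride = v - last
            last = v
        # Predict from whatever the table holds right now (possibly stale).
        if last + stride == actual:
            correct += 1
        pending.append((cycle + update_delay, actual))
    return correct

values = [100 + 4 * i for i in range(20)]   # a simple stride-4 value stream

# Updates arrive before the next access: only the two warm-up predictions miss.
print(simulate_stride(values, access_interval=10, update_delay=3))  # 18

# Entry is re-accessed every 2 cycles but updates take 5: always stale.
print(simulate_stride(values, access_interval=2, update_delay=5))   # 0
```

In the second call the table always lags the value stream by several instances, so a perfectly regular stride pattern is never predicted correctly. This is the situation the abstract attributes to the 22% of instructions that revisit an entry within 2 cycles.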


Published In

PACT '00: Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques
October 2000
ISBN: 0769506224

Publisher

IEEE Computer Society, United States


Qualifiers

  • Article

Conference

PACT00

Acceptance Rates

Overall Acceptance Rate 121 of 471 submissions, 26%

Cited By

  • (2005) Enhancing Memory-Level Parallelism via Recovery-Free Value Prediction. IEEE Transactions on Computers 54:7 (897-912). DOI: 10.1109/TC.2005.117. Online publication date: 1-Jul-2005
  • (2004) A Complexity-Effective Approach to ALU Bandwidth Enhancement for Instruction-Level Temporal Redundancy. Proceedings of the 31st annual international symposium on Computer architecture. DOI: 10.5555/998680.1006732. Online publication date: 19-Jun-2004
  • (2004) A Complexity-Effective Approach to ALU Bandwidth Enhancement for Instruction-Level Temporal Redundancy. ACM SIGARCH Computer Architecture News 32:2 (376). DOI: 10.1145/1028176.1006732. Online publication date: 2-Mar-2004
  • (2004) Scaling the issue window with look-ahead latency prediction. Proceedings of the 18th annual international conference on Supercomputing (217-226). DOI: 10.1145/1006209.1006240. Online publication date: 26-Jun-2004
  • (2003) Detecting global stride locality in value streams. ACM SIGARCH Computer Architecture News 31:2 (324-335). DOI: 10.1145/871656.859656. Online publication date: 1-May-2003
  • (2003) Detecting global stride locality in value streams. Proceedings of the 30th annual international symposium on Computer architecture (324-335). DOI: 10.1145/859618.859656. Online publication date: 9-Jun-2003
  • (2003) Enhancing memory level parallelism via recovery-free value prediction. Proceedings of the 17th annual international conference on Supercomputing (326-335). DOI: 10.1145/782814.782859. Online publication date: 23-Jun-2003
  • (2002) Latency and energy aware value prediction for high-frequency processors. Proceedings of the 16th international conference on Supercomputing (45-56). DOI: 10.1145/514191.514201. Online publication date: 22-Jun-2002
  • (2002) On Augmenting Trace Cache for High-Bandwidth Value Prediction. IEEE Transactions on Computers 51:9 (1074-1088). DOI: 10.1109/TC.2002.1032626. Online publication date: 1-Sep-2002
  • (2001) On Table Bandwidth and Its Update Delay for Value Prediction on Wide-Issue ILP Processors. IEEE Transactions on Computers 50:8 (847-852). DOI: 10.1109/12.947012. Online publication date: 1-Aug-2001
