High-performance frontends for trace processors

January 1999

Author: Quinn Able Jacobson
Supervisor: James E. Smith

Publisher: The University of Wisconsin - Madison

ISBN: 978-0-599-50587-2

Order Number: AAI9938730

Pages: 298

Abstract

Trace processors use a new microarchitecture organization that achieves higher performance than conventional superscalar processors. A trace processor can be logically divided into two main parts: the frontend (instruction fetch) and the backend (instruction execution). In the frontend, a trace cache enables the processor to fetch across multiple branches in a single cycle. The trace cache records short dynamic sequences of instructions, called traces, and can provide one trace of instructions per cycle when a path is repeated. Trace processors use a distributed backend, which consists of simple processing elements that are replicated for high aggregate bandwidth. The frontend dispatches traces to the backend, one per processing element.
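The trace cache idea can be illustrated with a minimal sketch. It is not the design evaluated in the thesis: the `Instr` record, the `MAX_TRACE_LEN` limit, and the choice to index traces by starting address plus branch outcomes are all assumptions made for illustration.

```python
from dataclasses import dataclass

MAX_TRACE_LEN = 16  # assumed maximum number of instructions per trace


@dataclass(frozen=True)
class Instr:
    """Hypothetical dynamic instruction record."""
    pc: int
    is_branch: bool = False
    taken: bool = False


class TraceCache:
    """Toy trace cache: maps (start PC, branch outcomes) to a stored trace."""

    def __init__(self):
        self.lines = {}

    @staticmethod
    def _key(trace):
        # A trace is identified by where it starts and which way each
        # embedded branch went when the trace was recorded.
        outcomes = tuple(ins.taken for ins in trace if ins.is_branch)
        return (trace[0].pc, outcomes)

    def fill(self, stream):
        """Cut a dynamic instruction stream into traces and store them."""
        for i in range(0, len(stream), MAX_TRACE_LEN):
            trace = tuple(stream[i:i + MAX_TRACE_LEN])
            self.lines[self._key(trace)] = trace

    def fetch(self, start_pc, predicted_outcomes):
        """Return a whole trace in one lookup, or None on a miss."""
        return self.lines.get((start_pc, tuple(predicted_outcomes)))
```

A hit returns every instruction of the trace at once, including those past taken branches, which is what lets the frontend fetch across multiple branches in a single lookup. A different predicted path from the same starting address misses in this sketch.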

This thesis proposes three mechanisms that enable very high-performance frontends for trace processors. The first mechanism, trace pre-construction, augments the trace cache by performing a task analogous to prefetching. It increases both the average performance of the trace cache and the robustness of the trace cache to varying workloads. Pre-construction can reduce the trace cache miss rates by up to 80% for the SPECint95 benchmarks.

The second mechanism, instruction pre-processing, takes advantage of the trace cache to dynamically optimize program binaries. It can perform transformations that both dynamically optimize common instruction sequences and take advantage of implementation-specific hardware. The dynamic optimizations, performed in the frontend, expose more parallelism to the trace processor backend. Three specific optimizations are considered: instruction scheduling, constant propagation and instruction collapsing. Together these optimizations increase performance by up to 20% for the SPECint95 benchmarks.
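One of the optimizations named above, constant propagation, can be sketched on a single trace. The toy instruction tuples (`li`, `addi`, `add`) and the function name `propagate_constants` are hypothetical and assume every instruction writes a destination register; the thesis may define the transformation differently.

```python
def propagate_constants(trace):
    """Rewrite instructions whose input is a known constant into immediate loads.

    Each instruction is a tuple (op, dst, *srcs) in an assumed toy ISA:
      ("li",   dst, imm)        dst = imm
      ("addi", dst, src, imm)   dst = src + imm
      ("add",  dst, src1, src2) dst = src1 + src2
    """
    known = {}  # register -> constant value known at this point in the trace
    out = []
    for op, dst, *srcs in trace:
        if op == "li":
            known[dst] = srcs[0]
            out.append((op, dst, srcs[0]))
        elif op == "addi" and srcs[0] in known:
            # The source is a known constant, so the result is too; the
            # rewritten instruction no longer depends on the producer of src.
            value = known[srcs[0]] + srcs[1]
            known[dst] = value
            out.append(("li", dst, value))
        else:
            known.pop(dst, None)  # dst now holds a value unknown to this pass
            out.append((op, dst, *srcs))
    return out
```

Because a trace is a straight-line sequence with its branch outcomes fixed, a single forward pass like this is sufficient in the sketch. Removing the dependence between the `addi` and its producer is one way such a rewrite could expose more parallelism to the backend.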

The third mechanism, next-trace prediction, is a control flow predictor that matches the bandwidth of the trace cache without sacrificing prediction accuracy. It performs the functions of both branch prediction and branch target prediction, and it works in units of traces, so its bandwidth is matched to that of the trace cache. Next-trace prediction has prediction accuracy comparable to the best traditional branch predictors, while providing significantly higher branch throughput.
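Predicting in units of traces can be sketched as a table indexed by the identifiers of recently executed traces. This is a simplified illustration under assumed names (`NextTracePredictor`, `history_len`) and an unbounded table, not the predictor organization proposed in the thesis.

```python
from collections import deque


class NextTracePredictor:
    """Toy predictor: maps the last few trace IDs to the trace that followed."""

    def __init__(self, history_len=2):
        # Path history of the most recent trace identifiers (assumed length).
        self.history = deque(maxlen=history_len)
        self.table = {}

    def predict(self):
        """Return the predicted next trace ID, or None if this path is new."""
        return self.table.get(tuple(self.history))

    def update(self, actual_trace_id):
        """Record which trace actually followed the current history."""
        self.table[tuple(self.history)] = actual_trace_id
        self.history.append(actual_trace_id)
```

Each prediction names an entire trace, so one lookup implicitly covers every branch direction and target inside that trace. That is the sense in which a trace-level predictor supplies one prediction per trace cache access.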

Contributors

Quinn Able Jacobson
Sun Microsystems
Jim Smith
University of the West of England

Recommendations

Support for speculative execution in high-performance processors
Trace processors
MICRO 30: Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture

Traces are dynamic instruction sequences constructed and cached by hardware. A microarchitecture organized around traces is presented as a means for efficiently executing many instructions per cycle. Trace processors exploit both control flow and data ...
