DOI:10.1145/300979.300995

Simultaneous subordinate microthreading (SSMT)

Published: 01 May 1999

Abstract

Current work in Simultaneous Multithreading provides little benefit to programs that aren't partitioned into threads. We propose Simultaneous Subordinate Microthreading (SSMT) to correct this by spawning subordinate threads that perform optimizations on behalf of the single primary thread. These threads, written in microcode, are issued and executed concurrently with the primary thread. They directly manipulate the microarchitecture to improve the primary thread's branch prediction accuracy, cache hit rate, and prefetch effectiveness. All contribute to the performance of the primary thread. This paper introduces SSMT and discusses its potential to increase performance. We illustrate its usefulness with an SSMT machine that executes subordinate microthreads to improve the branch prediction of the primary thread. We show simulation results for the SPECint95 benchmarks.
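
To make the branch-prediction use case concrete, here is a minimal toy sketch in Python. It is not the mechanism from the paper: the abstract does not describe the microthread routines or the hardware interface, so every name below (`TwoBitCounter`, `MicrothreadPredictor`, `count_mispredictions`) is hypothetical. The sketch assumes a single branch, a 2-bit saturating counter standing in for the hardware predictor, and a helper routine standing in for a subordinate microthread that learns which outcome followed each short outcome history and overrides the hardware guess once it has an opinion. Concurrency is not modeled; the helper simply runs alongside each branch.

```python
class TwoBitCounter:
    """Stand-in for the hardware predictor: one 2-bit saturating counter."""

    def __init__(self):
        self.state = 2  # states 0-1 predict not taken, 2-3 predict taken

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        self.state = min(3, self.state + 1) if taken else max(0, self.state - 1)


class MicrothreadPredictor:
    """Hypothetical subordinate routine (an assumption, not the paper's design):
    remembers which outcome last followed each recent-outcome history."""

    def __init__(self, history_len=2):
        self.history_len = history_len
        self.history = ()   # most recent outcomes, oldest first
        self.table = {}     # history tuple -> outcome last seen after it

    def predict(self):
        # None means "no opinion yet"; the caller falls back to the hardware.
        return self.table.get(self.history)

    def update(self, taken):
        self.table[self.history] = taken
        self.history = (self.history + (taken,))[-self.history_len:]


def count_mispredictions(outcomes, use_microthread):
    """Count mispredictions of one branch over a trace of taken/not-taken outcomes."""
    hardware = TwoBitCounter()
    helper = MicrothreadPredictor()
    misses = 0
    for taken in outcomes:
        guess = hardware.predict()
        if use_microthread:
            override = helper.predict()
            if override is not None:
                guess = override  # the helper's prediction replaces the hardware's
        if guess != taken:
            misses += 1
        # Both predictors train on the resolved outcome.
        hardware.update(taken)
        helper.update(taken)
    return misses


if __name__ == "__main__":
    trace = [True, True, False] * 10  # taken, taken, not taken, repeated
    print(count_mispredictions(trace, use_microthread=False))
    print(count_mispredictions(trace, use_microthread=True))
```

On this 30-branch trace the counter alone mispredicts every not-taken instance (10 misses), while the version with the helper mispredicts only once, during warm-up. The trace is contrived to favor history-based prediction, so the numbers illustrate the idea and say nothing about the SPECint95 results reported in the paper.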

Published In

ISCA '99: Proceedings of the 26th annual international symposium on Computer architecture
May 1999
317 pages
ISBN:0769501702
  • ACM SIGARCH Computer Architecture News, Volume 27, Issue 2
    Special Issue: Proceedings of the 26th annual international symposium on Computer architecture (ISCA '99)
    May 1999
    298 pages
    ISSN:0163-5964
    DOI:10.1145/307338

Publisher

IEEE Computer Society

United States

Qualifiers

  • Article

Conference

ISCA99
Acceptance Rates

ISCA '99 Paper Acceptance Rate 26 of 135 submissions, 19%;
Overall Acceptance Rate 543 of 3,203 submissions, 17%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 158
  • Downloads (Last 6 weeks): 36
Reflects downloads up to 25 Nov 2024

Cited By

  • (2024) Decoupled Vector Runahead for Prefetching Nested Memory-Access Chains. IEEE Micro 44:4 (20-26). DOI:10.1109/MM.2024.3406891. Online publication date: 1-Jul-2024
  • (2023) Decoupled Vector Runahead. Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture (17-31). DOI:10.1145/3613424.3614255. Online publication date: 28-Oct-2023
  • (2022) CRISP: critical slice prefetching. Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (300-313). DOI:10.1145/3503222.3507745. Online publication date: 28-Feb-2022
  • (2021) Criticality Driven Fetch. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture (380-391). DOI:10.1145/3466752.3480115. Online publication date: 18-Oct-2021
  • (2021) Branch Runahead: An Alternative to Branch Prediction for Impossible to Predict Branches. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture (804-815). DOI:10.1145/3466752.3480053. Online publication date: 18-Oct-2021
  • (2019) Efficient Data Supply for Parallel Heterogeneous Architectures. ACM Transactions on Architecture and Code Optimization 16:2 (1-23). DOI:10.1145/3310332. Online publication date: 26-Apr-2019
  • (2019) Bootstrapping. Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (687-700). DOI:10.1145/3297858.3304052. Online publication date: 4-Apr-2019
  • (2018) SWOOP: software-hardware co-design for non-speculative, execute-ahead, in-order cores. ACM SIGPLAN Notices 53:4 (328-343). DOI:10.1145/3296979.3192393. Online publication date: 11-Jun-2018
  • (2018) Minnow. ACM SIGPLAN Notices 53:2 (593-607). DOI:10.1145/3296957.3173197. Online publication date: 19-Mar-2018
  • (2018) SWOOP: software-hardware co-design for non-speculative, execute-ahead, in-order cores. Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation (328-343). DOI:10.1145/3192366.3192393. Online publication date: 11-Jun-2018
