Revisiting reorder buffer architecture for next generation high performance computing

Min Choi¹, Jong Hyuk Park² & Young-Sik Jeong³

Abstract

Modern microprocessors achieve high application performance at an acceptable level of power dissipation. The reorder buffer allows instructions that execute out of order to be committed in program order. It plays a key role in modern microprocessors because performance improvement techniques rely heavily on aggressive speculation to feed wide-issue, out-of-order, deep pipelines. The reorder buffer is particularly important to the trade-off between power and performance: enlarging it improves performance, but naive scaling of the conventional reorder buffer architecture can severely increase complexity and power consumption. In this paper, we propose low-power reorder buffer techniques for contemporary microprocessors. The separated reorder buffer (SROB) reduces power dissipation through deferred allocation and early release. Deferred allocation delays the SROB allocation of an instruction until all of its data dependencies are resolved. The instructions are then executed in program order and are released sooner from the SROB. The result of an instruction is written into the rename buffers immediately after execution completes, and the result values in the rename buffers are written into the architectural register file at the commit stage. The proposed approaches provide higher resource utilization and lower power consumption.
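
The deferred-allocation and early-release flow described in the abstract can be illustrated with a small cycle-level toy model. This is only a sketch under stated assumptions, not the authors' design: the abstract gives no structural details, so the names `Instr`, `simulate`, `srob_size` and `latest`, the one-entry-per-executing-instruction policy, and the per-instruction latency are all hypothetical. The model allocates an SROB entry only when every source operand is available, frees the entry as soon as the result is written to a rename buffer, and copies rename-buffer values to the architectural register file in program order.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Instr:
    dest: str                   # destination register name
    srcs: Tuple[str, ...]       # source register names
    fn: Callable[..., int]      # operation applied to the source values
    latency: int = 1            # assumed execution latency in cycles


def simulate(program: List[Instr], arch_regs: Dict[str, int], srob_size: int = 2):
    """Toy model (hypothetical): returns (arch file, cycles, peak SROB occupancy)."""
    arch = dict(arch_regs)
    rename: Dict[int, int] = {}            # seq -> result, done but not committed
    latest: Dict[str, int] = {}            # reg -> seq of youngest uncommitted writer
    waiting = deque(enumerate(program))    # decoded, no SROB entry yet
    srob: Dict[int, Tuple[int, int]] = {}  # seq -> (finish cycle, result)
    next_commit = 0
    cycle = peak = 0

    def operand(reg: str):
        seq = latest.get(reg)
        if seq is None:
            return arch[reg]               # no uncommitted writer: read the arch file
        return rename.get(seq)             # None while the producer still executes

    while next_commit < len(program):
        cycle += 1
        # Early release: a finished instruction moves its result to a
        # rename buffer and frees its SROB entry immediately.
        for seq in [s for s, (done, _) in srob.items() if done <= cycle]:
            rename[seq] = srob.pop(seq)[1]
        # Commit: rename-buffer values reach the architectural file in program order.
        while next_commit in rename:
            dest = program[next_commit].dest
            arch[dest] = rename.pop(next_commit)
            if latest.get(dest) == next_commit:
                del latest[dest]
            next_commit += 1
        # Deferred allocation: the oldest waiting instruction gets an SROB
        # entry only once all of its source operands are available.
        while waiting and len(srob) < srob_size:
            seq, ins = waiting[0]
            vals = [operand(r) for r in ins.srcs]
            if any(v is None for v in vals):
                break                      # dependency unresolved: hold no entry
            waiting.popleft()
            srob[seq] = (cycle + ins.latency, ins.fn(*vals))
            latest[ins.dest] = seq
        peak = max(peak, len(srob))
    return arch, cycle, peak


program = [
    Instr("r1", ("r0",), lambda a: a + 1, latency=3),  # long-latency producer
    Instr("r2", ("r0",), lambda a: a * 2),             # independent, short latency
    Instr("r3", ("r1", "r2"), lambda a, b: a + b),     # waits for r1 and r2
]
arch, cycles, peak = simulate(program, {"r0": 10, "r1": 0, "r2": 0, "r3": 0})
print(arch, cycles, peak)
```

In this toy run the third instruction holds no SROB entry while it waits for `r1`, which is the occupancy saving that deferred allocation is meant to capture. A conventional reorder buffer would instead hold an entry for it from dispatch until commit.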

Author information

Authors and Affiliations

Department of Information and Communication Engineering, Chungbuk National University, Cheongju, Republic of Korea
Min Choi
Seoul National University of Science and Technology, Seoul, Republic of Korea
Jong Hyuk Park
Wonkwang University, Iksan, Republic of Korea
Young-Sik Jeong

Corresponding author

Correspondence to Young-Sik Jeong.

About this article

Cite this article

Choi, M., Park, J.H. & Jeong, YS. Revisiting reorder buffer architecture for next generation high performance computing. J Supercomput 65, 484–495 (2013). https://doi.org/10.1007/s11227-011-0734-x

Published: 01 February 2012
Issue Date: August 2013
DOI: https://doi.org/10.1007/s11227-011-0734-x
