Abstract
Modern microprocessors achieve high application performance at an acceptable level of power dissipation. The reorder buffer allows instructions that execute out of order to be committed in order. It plays a key role in modern microprocessors because performance improvement techniques rely heavily on aggressive speculation to feed wider-issue, out-of-order, deep pipelines. The reorder buffer is particularly important to the power-performance trade-off: enlarging it improves performance, but naive scaling of the conventional reorder buffer architecture can severely increase complexity and power consumption. In this paper, we propose low-power reorder buffer techniques for contemporary microprocessors. First, the separated reorder buffer (SROB) reduces power dissipation through deferred allocation and early release. Deferred allocation delays the SROB allocation of instructions until all their data dependencies are resolved. The instructions are then executed in program order and released sooner from the SROB. The result of an instruction is written into the rename buffers immediately after execution completes. The result values in the rename buffers are then written into the architectural register file at the commit stage. The proposed approaches provide higher resource utilization and lower power consumption.
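To make the last two steps of the abstract concrete (results go to rename buffers as soon as execution completes, and reach the architectural register file only at commit), the following is a minimal behavioural sketch in Python. It is not the paper's SROB design: the class `ToyCore`, its method names, and the use of sequence numbers as rename buffer keys are all assumptions made for illustration, and deferred allocation and power effects are not modelled.

```python
from collections import deque


class ToyCore:
    """Hypothetical toy model: out-of-order completion, in-order commit."""

    def __init__(self):
        self.arch_regs = {}    # architectural register file (committed state)
        self.rename_buf = {}   # seq -> (dest_reg, value), filled at completion
        self.rob = deque()     # sequence numbers of in-flight instructions, program order

    def dispatch(self, seq):
        # Allocate a reorder buffer entry in program order.
        self.rob.append(seq)

    def complete(self, seq, dest_reg, value):
        # The result is written to the rename buffer right after execution,
        # regardless of whether older instructions have finished.
        self.rename_buf[seq] = (dest_reg, value)

    def commit(self):
        # Retire only from the head of the reorder buffer, so the
        # architectural state is updated strictly in program order.
        committed = []
        while self.rob and self.rob[0] in self.rename_buf:
            seq = self.rob.popleft()
            dest_reg, value = self.rename_buf.pop(seq)
            self.arch_regs[dest_reg] = value
            committed.append(seq)
        return committed


core = ToyCore()
for seq in (0, 1, 2):
    core.dispatch(seq)

core.complete(2, "r3", 30)   # youngest instruction finishes first
first = core.commit()        # nothing retires: instruction 0 is still pending
core.complete(0, "r1", 10)
second = core.commit()       # only instruction 0 retires
core.complete(1, "r2", 20)
third = core.commit()        # instructions 1 and 2 retire together, in order
```

In this sketch the value produced by instruction 2 sits in the rename buffer until instructions 0 and 1 have committed, which is the behaviour that keeps the architectural state precise under speculation.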
Cite this article
Choi, M., Park, J.H. & Jeong, YS. Revisiting reorder buffer architecture for next generation high performance computing. J Supercomput 65, 484–495 (2013). https://doi.org/10.1007/s11227-011-0734-x
DOI: https://doi.org/10.1007/s11227-011-0734-x