research-article

Compiler-Assisted Multiple Instruction Word Retry for VLIW Architectures

Published: 01 December 2001

Abstract

Very Long Instruction Word (VLIW) architectures can enhance performance by exploiting fine-grained instruction-level parallelism. In this paper, we describe a compiler-assisted multiple instruction word retry scheme for VLIW architectures. A read buffer is used to resolve the more frequent on-path hazards, while the compiler resolves the remaining branch hazards. Performance evaluation is described for 11 benchmark programs based on the IBM VLIW research compiler, Chameleon. Experimental results indicate that, for a VLIW machine with P functional units that must roll back N instruction words, a read buffer of 2NP entries combined with compiler assistance can be an effective approach, yielding low runtime overhead and small code growth for P = 4, 8, 12, and 16 and N ≤ 3.
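To make the sizing rule concrete, the sketch below tabulates the 2NP read-buffer entry count for the configurations named in the abstract (P = 4, 8, 12, 16 and N = 1 to 3). Only the formula 2NP and the parameter ranges come from the abstract; the function name `read_buffer_entries` and the table layout are illustrative assumptions, and the code says nothing about how the buffer itself is organized.

```python
def read_buffer_entries(n_words: int, p_units: int) -> int:
    """Return the read-buffer size 2*N*P quoted in the abstract.

    n_words: rollback distance N, in instruction words.
    p_units: number of functional units P.
    The function name and the validation are assumptions for illustration.
    """
    if n_words < 1 or p_units < 1:
        raise ValueError("N and P must be positive integers")
    return 2 * n_words * p_units


# Configurations evaluated in the abstract: P in {4, 8, 12, 16}, N <= 3.
P_VALUES = (4, 8, 12, 16)
N_VALUES = (1, 2, 3)

SIZES = {
    (n, p): read_buffer_entries(n, p)
    for p in P_VALUES
    for n in N_VALUES
}

if __name__ == "__main__":
    print("P   " + "  ".join(f"N={n}" for n in N_VALUES))
    for p in P_VALUES:
        row = "  ".join(f"{SIZES[(n, p)]:3d}" for n in N_VALUES)
        print(f"{p:<3d} {row}")
```

Under this rule the buffer grows linearly in both parameters, so the largest configuration in the abstract (P = 16, N = 3) would need 96 entries.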

Cited By

  • (2018) An error recoverable structure based on complementary logic and alternating-retry. Journal of Computer Science and Technology 20:6 (885-894). doi:10.1007/s11390-005-0885-4. Online publication date: 21-Dec-2018
  • (2018) Using Error Correcting Codes Without Speed Penalty in Embedded Memories. Journal of Electronic Testing: Theory and Applications 29:3 (383-400). doi:10.1007/s10836-013-5386-8. Online publication date: 28-Dec-2018

Information & Contributors

Information

Published In

IEEE Transactions on Parallel and Distributed Systems  Volume 12, Issue 12
December 2001
148 pages

Publisher

IEEE Press

Author Tags

  1. Fault-tolerant computing
  2. VLIW architectures
  3. compilers
  4. instruction level parallelism
  5. instruction retry

Qualifiers

  • Research-article
