DOI: 10.1109/ISCA.2008.30

A Proactive Wearout Recovery Approach for Exploiting Microarchitectural Redundancy to Extend Cache SRAM Lifetime

Published: 01 June 2008

Abstract

Microarchitectural redundancy has been proposed as a means of improving chip lifetime reliability. It is typically used in a reactive way, allowing chips to maintain operability in the presence of failures by detecting and isolating, correcting, and/or replacing components on a first-come, first-served basis only after they become faulty. In this paper, we explore an alternative, preferable method of exploiting microarchitectural redundancy to enhance chip lifetime reliability. In our proposed approach, redundancy is used proactively to allow non-faulty microarchitecture components to be temporarily deactivated, on a rotating basis, to suspend and/or recover from certain wearout effects. This approach improves chip lifetime reliability by warding off the onset of wearout failures as opposed to reacting to them after the fact. Applied to on-chip cache SRAM for combating NBTI-induced wearout failure, our proactive wearout recovery approach increases the lifetime reliability (measured as mean time to failure) of the cache by about a factor of seven relative to no use of microarchitectural redundancy and a factor of five relative to conventional reactive use of redundancy having similar area overhead.
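
The rotating deactivation described in the abstract can be illustrated with a toy sketch. This is not the paper's NBTI model or its hardware mechanism: the function names, the round-robin order, and the `stress_per_epoch` and `recovery_fraction` parameters are all hypothetical, chosen only to show how taking one unit offline per epoch keeps accumulated wear below that of a unit that is never rested.

```python
from itertools import islice


def rotation_schedule(num_units):
    """Yield, epoch by epoch, the index of the unit placed in recovery mode.

    Assumption: simple round-robin rotation over all units.
    """
    epoch = 0
    while True:
        yield epoch % num_units
        epoch += 1


def simulate_wear(num_units, epochs, stress_per_epoch=1.0, recovery_fraction=0.5):
    """Return per-unit wear after `epochs` epochs of proactive rotation.

    Toy model (hypothetical): an active unit gains `stress_per_epoch` wear
    each epoch; the resting unit sheds `recovery_fraction` of its wear.
    """
    wear = [0.0] * num_units
    for resting in islice(rotation_schedule(num_units), epochs):
        for unit in range(num_units):
            if unit == resting:
                # Deactivated unit partially recovers instead of aging.
                wear[unit] *= 1.0 - recovery_fraction
            else:
                wear[unit] += stress_per_epoch
    return wear


if __name__ == "__main__":
    print(list(islice(rotation_schedule(4), 6)))
    print(simulate_wear(num_units=4, epochs=8))
```

In this sketch a unit that was never rested would reach a wear of `epochs * stress_per_epoch`, so comparing that value with `max(simulate_wear(...))` gives a rough feel for the benefit of rotation under these assumed parameters.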

            Information

            Published In

            ISCA '08: Proceedings of the 35th Annual International Symposium on Computer Architecture
            June 2008
            449 pages
            ISBN: 9780769531748

            ACM SIGARCH Computer Architecture News, Volume 36, Issue 3
            June 2008
            449 pages
            ISSN: 0163-5964
            DOI: 10.1145/1394608

            Publisher

            IEEE Computer Society

            United States

            Author Tags

            1. lifetime reliability
            2. microarchitectural redundancy
            3. proactive approach
            4. wearout recovery

            Qualifiers

            • Article

            Conference

            ISCA08

            Acceptance Rates

            ISCA '08 Paper Acceptance Rate 37 of 259 submissions, 14%;
            Overall Acceptance Rate 543 of 3,203 submissions, 17%

            Cited By

            • (2017) Implications of accelerated self-healing as a key design knob for cross-layer resilience. Integration, the VLSI Journal, 56:C (167-180). DOI: 10.1016/j.vlsi.2016.10.008. Online publication date: 1-Jan-2017
            • (2016) Area-energy tradeoffs of logic wear-leveling for BTI-induced aging. Proceedings of the ACM International Conference on Computing Frontiers (37-44). DOI: 10.1145/2903150.2903171. Online publication date: 16-May-2016
            • (2016) Invited - Optimizing device reliability effects at the intersection of physics, circuits, and architecture. Proceedings of the 53rd Annual Design Automation Conference (1-6). DOI: 10.1145/2897937.2905016. Online publication date: 5-Jun-2016
            • (2016) Enhancing the L1 Data Cache Design to Mitigate HCI. IEEE Computer Architecture Letters, 15:2 (93-96). DOI: 10.1109/LCA.2015.2460736. Online publication date: 1-Jul-2016
            • (2015) Hayat. Proceedings of the 52nd Annual Design Automation Conference (1-6). DOI: 10.1145/2744769.2744849. Online publication date: 7-Jun-2015
            • (2015) EnAAM. Proceedings of the 52nd Annual Design Automation Conference (1-6). DOI: 10.1145/2744769.2744834. Online publication date: 7-Jun-2015
            • (2015) NBTI alleviation on FinFET-made GPUs by utilizing device heterogeneity. Integration, the VLSI Journal, 51:C (10-20). DOI: 10.1016/j.vlsi.2015.04.003. Online publication date: 1-Sep-2015
            • (2014) Modeling and Experimental Demonstration of Accelerated Self-Healing Techniques. Proceedings of the 51st Annual Design Automation Conference (1-6). DOI: 10.1145/2593069.2593162. Online publication date: 1-Jun-2014
            • (2013) Design and implementation of an adaptive proactive reconfiguration technique for SRAM caches. Proceedings of the Conference on Design, Automation and Test in Europe (1303-1306). DOI: 10.5555/2485288.2485600. Online publication date: 18-Mar-2013
            • (2013) Employing circadian rhythms to enhance power and reliability. ACM Transactions on Design Automation of Electronic Systems, 18:3 (1-23). DOI: 10.1145/2491477.2491482. Online publication date: 29-Jul-2013
