Abstract
Statistical data from many application fields confirm that System-on-Chip (SoC) products implemented in modern deep-submicron technologies are becoming increasingly sensitive to transient errors such as soft errors. Although a thorough understanding of the services that an SoC provides is an important step toward meeting stringent system requirements, designers can no longer ignore emerging safety and reliability issues in nanoscale devices. Proper actions should be taken at various stages of system design to mitigate the effect of such errors and enhance the safety of SoCs in fault-prone environments. SoC designs can therefore benefit from knowing the soft-error rate (SER) of the different cores, as well as the failure rate of the whole system, at a very early stage of SoC development. Such data enable companies and designers to make the right decisions at the right time concerning the intensity of error-protection mechanisms across different modules. This paper proposes a new quantitative method to estimate the SER of different modules inside an SoC by means of an executable model. The executable model of a system is based on the Unified Modeling Language Real-Time (UML-RT) standard and is exercised by the actual workload. Experimental results show that the proposed quantitative method is 17% more accurate than previous error estimation techniques.
Cite this article
Neishaburi, M.H., Zilic, Z. System on chip failure rate assessment using the executable model of a system. Computing 97, 611–629 (2015). https://doi.org/10.1007/s00607-013-0372-7