Abstract
In the era of platforms hosting multiple applications with variable reliability needs, worst-case platform-wide fault-tolerance decisions are neither optimal nor desirable. As a solution to this problem, designs commonly employ adaptive fault-tolerance strategies that provide each application with the reliability level it actually needs. However, in the CGRA domain, the existing schemes either only allow switching between different levels of modular redundancy (duplication, triplication, etc.) or protect only a particular region of a device (e.g. configuration memory, computation, or data memory). To complement these strategies, we propose private fault-tolerance environments which, in addition to modular redundancy, also provide low-cost sub-modular (e.g. residue mod 3) redundancy capable of handling both permanent and temporary faults in configuration memory, computation, communication, and data memory. In addition, we present adaptive configuration scrubbing techniques which prevent fault accumulation in the configuration memory. Simulation results using a few selected algorithms (FFT, matrix multiplication, and FIR filter) show that the proposed approach is capable of providing flexible protection with energy overhead ranging from 3.125 % to 107 % for different reliability levels. Synthesis results confirm that the proposed architecture reduces the area overhead for self-checking (58 %) and fault-tolerant (7.1 %) versions, compared to state-of-the-art adaptive reliability techniques.
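To make the sub-modular option more concrete, the following is a minimal Python sketch of residue mod 3 checking for addition and multiplication. It is an illustration under stated assumptions, not the architecture described in the article: the function names (`residue3`, `checked_add`, `checked_mul`, `is_consistent`) are hypothetical, and a hardware implementation would use dedicated residue generators rather than the `%` operator.

```python
def residue3(x: int) -> int:
    """Return x mod 3, the 2-bit check symbol carried alongside the data."""
    return x % 3


def checked_add(a: int, b: int) -> tuple[int, int]:
    """Add a and b, and predict the residue of the sum from the operand residues."""
    result = a + b
    # The prediction uses only the operand residues, not the computed result.
    predicted = (residue3(a) + residue3(b)) % 3
    return result, predicted


def checked_mul(a: int, b: int) -> tuple[int, int]:
    """Multiply a and b, and predict the residue of the product."""
    result = a * b
    predicted = (residue3(a) * residue3(b)) % 3
    return result, predicted


def is_consistent(value: int, predicted: int) -> bool:
    """Compare the residue of the (possibly corrupted) value with the prediction."""
    return residue3(value) == predicted


if __name__ == "__main__":
    total, check = checked_add(25, 17)
    print(is_consistent(total, check))             # fault-free result passes
    corrupted = total ^ (1 << 4)                   # inject a single bit flip
    print(is_consistent(corrupted, check))         # the flip is flagged
```

A single bit flip changes the value by plus or minus a power of two, and no power of two is divisible by 3, so the residue of the corrupted value always differs from the predicted one. The sketch therefore detects every single-bit error in the result while carrying only a 2-bit check symbol instead of a full duplicate of the datapath.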
Cite this article
Jafri, S.M.A.H., Piestrak, S.J., Hemani, A. et al. Private reliability environments for efficient fault-tolerance in CGRAs. Des Autom Embed Syst 18, 295–327 (2014). https://doi.org/10.1007/s10617-014-9129-6
DOI: https://doi.org/10.1007/s10617-014-9129-6