DOI: 10.1109/IPDPS.2011.36

Hauberk: Lightweight Silent Data Corruption Error Detector for GPGPU

Published: 16 May 2011

Abstract

The high performance and relatively low cost of GPU-based platforms make them an attractive alternative for general-purpose high performance computing (HPC). However, emerging HPC applications usually have stricter output correctness requirements than typical GPU applications (i.e., 3D graphics). This paper first analyzes the error resiliency of GPGPU platforms using a fault injection tool we have developed for commodity GPU devices. On average, 16-33% of injected faults cause silent data corruption (SDC) errors in the HPC programs executing on GPU. This SDC ratio is significantly higher than that measured in CPU programs (

Published In

IPDPS '11: Proceedings of the 2011 IEEE International Parallel & Distributed Processing Symposium
May 2011
1285 pages
ISBN: 9780769543857

Publisher

IEEE Computer Society

United States

Qualifiers

  • Article

Cited By

  • (2024) Assessing the Impact of Compiler Optimizations on GPUs Reliability. ACM Transactions on Architecture and Code Optimization 21:2 (1-22). DOI: 10.1145/3638249. Online publication date: 12-Jan-2024
  • (2022) GPU Devices for Safety-Critical Systems: A Survey. ACM Computing Surveys 55:7 (1-37). DOI: 10.1145/3549526. Online publication date: 15-Dec-2022
  • (2021) SUGAR. Proceedings of the ACM on Measurement and Analysis of Computing Systems 5:1 (1-29). DOI: 10.1145/3447375. Online publication date: 22-Feb-2021
  • (2019) Bi-Source Verification Against Silent Data Corruption in High Performance Computing. Proceedings of the 9th Balkan Conference on Informatics (1-4). DOI: 10.1145/3351556.3351567. Online publication date: 26-Sep-2019
  • (2018) Programmer-guided reliability for extreme-scale applications. International Journal of High Performance Computing Applications 32:5 (598-612). DOI: 10.1177/1094342016667625. Online publication date: 1-Sep-2018
  • (2018) LADR. Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing (156-167). DOI: 10.1145/3208040.3208043. Online publication date: 11-Jun-2018
  • (2018) Fault site pruning for practical reliability analysis of GPGPU applications. Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture (749-761). DOI: 10.1109/MICRO.2018.00066. Online publication date: 20-Oct-2018
  • (2017) Supporting automatic recovery in offloaded distributed programming models through MPI-3 techniques. Proceedings of the International Conference on Supercomputing (1-10). DOI: 10.1145/3079079.3079093. Online publication date: 14-Jun-2017
  • (2016) FlipBack. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (1-12). DOI: 10.5555/3014904.3014943. Online publication date: 13-Nov-2016
  • (2016) Understanding error propagation in GPGPU applications. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (1-12). DOI: 10.5555/3014904.3014932. Online publication date: 13-Nov-2016