Evaluation and mitigation of soft-errors in neural network-based object detection in three GPU architectures

FF dos Santos, L Draghetti, L Weigel, L Carro, P Navaux, P Rech
2017 47th Annual IEEE/IFIP International Conference on Dependable …, 2017 - ieeexplore.ieee.org
In this paper, we evaluate the reliability of the You Only Look Once (YOLO) object detection framework. We exposed GPUs of three different architectures (Kepler, Maxwell, and Pascal) to controlled neutron beams while they ran Darknet, a Convolutional Neural Network for automotive applications, detecting objects in both the Caltech and Visual Object Classes data sets. By analyzing the corrupted output of the neural network, we can distinguish between tolerable errors and critical errors, i.e., errors that could affect real-time system execution. Additionally, we propose an Algorithm-Based Fault-Tolerance (ABFT) strategy for the matrix multiplication kernels of neural networks that can detect and correct 50% to 60% of radiation-induced corruptions. We experimentally validate our hardening solution and compare its efficiency and efficacy with the available ECC.
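The abstract does not describe how the ABFT strategy works. As a rough illustration only, the sketch below shows the classic checksum-based ABFT scheme for matrix multiplication, which may differ from the authors' hardening. The function names `matmul` and `abft_matmul` and the `inject` parameter (which simulates a corrupted output element) are hypothetical, and integer matrices are assumed so that checksums can be compared exactly; floating-point data would need a tolerance.

```python
def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def abft_matmul(a, b, inject=None):
    """Checksum-protected product of integer matrices a (m x k) and b (k x n).

    inject is an optional (row, col, delta) tuple, hypothetical and for
    demonstration only, that corrupts one element of the result.
    Returns (c, status) with status "ok", "corrected", or "detected".
    """
    m, n = len(a), len(b[0])
    # Append a column-checksum row to a and a row-checksum column to b.
    a_c = a + [[sum(col) for col in zip(*a)]]
    b_r = [row + [sum(row)] for row in b]
    full = matmul(a_c, b_r)  # (m+1) x (n+1) product carrying checksums

    if inject is not None:
        i, j, delta = inject
        full[i][j] += delta  # simulate a radiation-induced corruption

    c = [row[:n] for row in full[:m]]
    # Recompute checksums from the data part and compare with the carried ones.
    bad_rows = [i for i in range(m) if sum(c[i]) != full[i][n]]
    bad_cols = [j for j in range(n)
                if sum(c[i][j] for i in range(m)) != full[m][j]]

    if not bad_rows and not bad_cols:
        return c, "ok"
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        # A single corrupted element sits at the intersection; repair it
        # using the difference between carried and recomputed row checksum.
        i, j = bad_rows[0], bad_cols[0]
        c[i][j] += full[i][n] - sum(c[i])
        return c, "corrected"
    return c, "detected"  # mismatch pattern that cannot be located


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
clean, clean_status = abft_matmul(a, b)
fixed, fixed_status = abft_matmul(a, b, inject=(0, 1, 100))
```

In this sketch a single corrupted element is located by the one row and one column whose checksums disagree, and then repaired. Other mismatch patterns are only reported as detected.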