Li et al., 2006 - Google Patents
Exploiting soft computing for increased fault toleranceLi et al., 2006
View PDF- Document ID
- 7776439581109843827
- Author
- Li X
- Yeung D
- Publication year
External Links
Snippet
Traditionally, fault tolerance researchers have made very strict assumptions about program correctness. Such strict notions of correctness are appropriate for workloads that are numerically oriented. However, a growing number of important workloads produce results …
- 230000001965 increased 0 title description 8
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0721—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment within a central processing unit [CPU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1405—Saving, restoring, recovering or retrying at machine instruction level
- G06F11/1407—Checkpointing the instruction stream
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0793—Remedial or corrective actions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
- G06F11/3672—Test management
- G06F11/3676—Test management for coverage analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3604—Software analysis for verifying properties of programs
- G06F11/3608—Software analysis for verifying properties of programs using formal methods, e.g. model checking, abstract interpretation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/008—Reliability or availability analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/86—Event-based monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2217/00—Indexing scheme relating to computer aided design [CAD]
- G06F2217/70—Fault tolerant, i.e. transient fault suppression
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Li et al. | Application-level correctness and its impact on fault tolerance | |
dos Santos et al. | Analyzing and increasing the reliability of convolutional neural networks on GPUs | |
Snir et al. | Addressing failures in exascale computing | |
Goloubeva et al. | Software-implemented hardware fault tolerance | |
Li et al. | Exploiting soft computing for increased fault tolerance | |
Khudia et al. | Efficient soft error protection for commodity embedded microprocessors using profile information | |
Borodin et al. | Protective redundancy overhead reduction using instruction vulnerability factor | |
Slijepcevic et al. | DTM: Degraded test mode for fault-aware probabilistic timing analysis | |
Ottavi et al. | Dependable multicore architectures at nanoscale: The view from europe | |
Fernandes dos Santos et al. | Kernel and layer vulnerability factor to evaluate object detection reliability in GPUs | |
Schirmeier | Efficient fault-injection-based assessment of software-implemented hardware fault tolerance | |
Thaker et al. | Characterization of error-tolerant applications when protecting control data | |
Di Mascio et al. | Open-source IP cores for space: A processor-level perspective on soft errors in the RISC-V era | |
Joshi et al. | Aloe: verifying reliability of approximate programs in the presence of recovery mechanisms | |
Liu et al. | Analyzing and increasing soft error resilience of Deep Neural Networks on ARM processors | |
Vera et al. | Selective replication: A lightweight technique for soft errors | |
Didehban et al. | Generic soft error data and control flow error detection by instruction duplication | |
Kong et al. | M2STaR: A Multimode Spatio-Temporal Redundancy Design for Fault-Tolerant Coarse-Grained Reconfigurable Architectures | |
Li et al. | Exploiting application-level correctness for low-cost fault tolerance | |
Sadi et al. | An efficient approach towards mitigating soft errors risks | |
Tajary et al. | IRHT: An SDC detection and recovery architecture based on value locality of instruction binary codes | |
Tonetto et al. | A knapsack methodology for hardware-based dmr protection against soft errors in superscalar out-of-order processors | |
Vassiliadis et al. | Artificial neural networks for online error detection | |
Reis III | Software modulated fault tolerance | |
Cong et al. | Impact of loop transformations on software reliability |