Evaluating hardware reliability in the presence of soft errors
Reliability is a major concern in embedded systems. Higher transistor density and lower supply voltages increase the vulnerability of embedded systems to soft errors. A Single Event Upset (SEU), also called a soft error, can flip a bit in a sequential element and lead to a system failure. Simulation-based fault injection is widely used to evaluate reliability, as suggested by ISO 26262. However, it is practically impossible to test all faults in a complex design, and random fault injection is a compromise that reduces accuracy and fault coverage. Formal verification is an alternative approach.
This research uses formal verification to evaluate the hardware reliability of a RISC-V Ibex Core in the presence of soft errors. We combine formal verification with fault injection and perform backward tracing to identify faults and categorize them by fault effect (no effect, Silent Data Corruption, crash, or hang). With formal verification, the entire state space and fault list can be explored exhaustively. We found that misaligned instructions can amplify fault effects. Beyond evaluating hardware reliability, the proposed method can help to determine a cost-effective fault protection strategy. We demonstrate how to use the method to formally evaluate the protection effectiveness of fault-tolerant technologies, for example by identifying which faults they can and cannot detect or protect against.
Formal verification, such as model checking, has limited applicability to Double Event Upsets (DEUs) because of infeasible runtime, proof effort, and state explosion. We develop a DEU exploration strategy that significantly reduces the model checking runtime and effort needed to explore DEUs. We also mitigate state explosion using abstractions and proper constraints. As a consequence, we successfully scale our method to explore DEUs within an acceptable time. We found that DEUs can aggravate SEUs: a DEU consisting of two SEUs that individually cause no effect can cause a system failure.
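The closing observation, that two individually harmless SEUs can combine into a failure, can be illustrated with a toy example. The sketch below is not the thesis's method and does not model the Ibex Core: it assumes a hypothetical 4-bit register protected by triple modular redundancy with a bitwise majority voter, and it uses brute-force enumeration in place of model checking. All names (`vote`, `inject`, `SITES`, `GOLDEN`) are illustrative assumptions.

```python
from itertools import combinations

WIDTH = 4          # assumed register width (hypothetical)
GOLDEN = 0b1010    # assumed fault-free register value (hypothetical)

# Every fault site is one bit in one of the three redundant copies.
SITES = [(copy, bit) for copy in range(3) for bit in range(WIDTH)]


def vote(copies):
    """Bitwise majority vote over three redundant register copies."""
    a, b, c = copies
    return (a & b) | (a & c) | (b & c)


def inject(sites):
    """Flip the given (copy, bit) sites and classify the voted output."""
    copies = [GOLDEN] * 3
    for copy, bit in sites:
        copies[copy] ^= 1 << bit
    if vote(copies) == GOLDEN:
        return "no effect"
    return "silent data corruption"


# Exhaustive single-fault list: one bit flip per run.
seu_results = {site: inject([site]) for site in SITES}

# Exhaustive double-fault list: every unordered pair of distinct sites.
deu_results = {pair: inject(pair) for pair in combinations(SITES, 2)}

harmful_deus = [pair for pair, effect in deu_results.items()
                if effect != "no effect"]

print("SEUs with an effect:", sum(e != "no effect" for e in seu_results.values()))
print("DEUs explored:", len(deu_results))
print("DEUs with an effect:", len(harmful_deus))
```

In this toy model every single flip is masked by the voter, while a pair of flips that hits the same bit position in two different copies outvotes the correct copy. The double-fault list also grows quadratically with the number of fault sites, which hints at why exhaustive DEU exploration is costly on a real design.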
University of Southampton
Xue, Bing
755777ee-472d-49a9-8cec-f6866530778c
October 2024
Zwolinski, Mark
adfcb8e7-877f-4bd7-9b55-7553b6cb3ea0
Halak, Basel
8221f839-0dfd-4f81-9865-37def5f79f33
Xue, Bing (2024) Evaluating hardware reliability in the presence of soft errors. University of Southampton, Doctoral Thesis, 235pp.
Record type: Thesis (Doctoral)
Text
Evaluating_Hardware_Reliability_in_the_presence_of_Soft_Errors
- Version of Record
Text
Final-thesis-submission-Examination-Mr-Bing-Xue
Restricted to Repository staff only
More information
Published date: October 2024
Identifiers
Local EPrints ID: 495097
URI: http://eprints.soton.ac.uk/id/eprint/495097
PURE UUID: 6e8c6fc8-6e3c-4736-ab7c-65ea4080b89c
Catalogue record
Date deposited: 29 Oct 2024 17:38
Last modified: 30 Oct 2024 02:55
Contributors
Author: Bing Xue
Thesis advisor: Mark Zwolinski
Thesis advisor: Basel Halak