IEEE Transactions on Nuclear Science | 2019

Strategies for Removing Common Mode Failures From TMR Designs Deployed on SRAM FPGAs

Abstract


Triple modular redundancy (TMR) with repair has proven to be an effective strategy for mitigating the effects of single-event upsets within the configuration memory of static random access memory field-programmable gate arrays. Applying TMR to the design successfully reduces the design’s neutron cross section by $80\times$. The effectiveness of TMR, however, is limited by the presence of single bits in the configuration memory which cause more than one TMR domain to fail simultaneously. We present three strategies to mitigate against these failures and improve the effectiveness of TMR: incremental routing, incremental placement, and striping. These techniques were tested using both fault injection and a wide spectrum neutron beam with the best technique offering a $400\times$ reduction to the design’s sensitive neutron cross section. An analysis from the radiation test shows that no single bits caused failure and that multicell upsets were the main cause of failure for these mitigation strategies.

Volume 66
Pages 207-215
DOI 10.1109/TNS.2018.2877579
Language English
Journal IEEE Transactions on Nuclear Science