A Theoretical Framework for Symbolic Quick Error Detection
Florian Lonsing, Subhasish Mitra, and Clark Barrett
Computer Science Department, Stanford University, Stanford, CA 94305, USA
E-mail: {lonsing, subh, barrett}@stanford.edu
Abstract: Symbolic quick error detection (SQED) is a formal pre-silicon verification technique targeted at processor designs. It leverages bounded model checking (BMC) to check a design for counterexamples to a self-consistency property: given the instruction set architecture (ISA) of the design, executing an instruction sequence twice on the same inputs must always produce the same outputs. Self-consistency is a universal, implementation-independent property. Consequently, in contrast to traditional verification approaches that use implementation-specific assertions (often generated manually), SQED does not require a full formal design specification or manually-written properties. Case studies have shown that SQED is effective for commercial designs and that SQED substantially improves design productivity. However, until now there has been no formal characterization of its bug-finding capabilities. We aim to close this gap by laying a formal foundation for SQED. We use a transition-system processor model and define the notion of a bug using an abstract specification relation. We prove the soundness of SQED, i.e., that any bug reported by SQED is in fact a real bug in the processor. Importantly, this result holds regardless of what the actual specification relation is. We next describe conditions under which SQED is complete, that is, what kinds of bugs it is guaranteed to find. We show that for a large class of bugs, SQED can always find a trace exhibiting the bug. Ultimately, we prove full completeness of a variant of SQED that uses specialized state reset instructions. Our results enable a rigorous understanding of SQED and its bug-finding capabilities and give insights on how to optimize implementations of SQED in practice.
I. INTRODUCTION
Pre-silicon verification of HW designs given as models in a HW description language (e.g., Verilog) is a critical step in HW design. Due to the steadily increasing complexity of designs, it is crucial to detect logic design bugs before fabrication to avoid more difficult and costly debugging in post-silicon validation.

Formal techniques such as bounded model checking (BMC) [1] have an advantage over traditional pre-silicon verification techniques such as simulation in that they are exhaustive up to the BMC bound. Hence, formal techniques provide valuable guarantees about the correctness of a design under verification (DUV) with respect to the checked properties. However, in traditional assertion-based formal verification techniques, these properties are implementation-specific and must be written manually based on expert knowledge about the DUV. Moreover, it is a well-known, long-standing challenge that sets of manually-written, implementation-specific properties might be insufficient to detect all bugs present in a DUV [2]–[6].
This work was supported by the Defense Advanced Research Projects Agency, grant FA8650-18-2-7854.
Article to appear in Proc. FMCAD 2020.
Symbolic quick error detection (SQED) [7]–[10] is a formal pre-silicon verification technique targeted at processor designs. In sharp contrast to traditional formal approaches, SQED does not require manually-written properties or a formal specification of the DUV. Instead, it checks whether a self-consistency [11] property holds in the DUV. The self-consistency property employed by SQED is universal and implementation-independent. Each instruction in the instruction set architecture (ISA) of the DUV is interpreted as a function in a mathematical sense. The self-consistency check then amounts to checking whether the outputs produced by executing a particular instruction sequence match if the sequence is executed twice, assuming the inputs to the two sequences also match.

SQED leverages BMC to exhaustively explore all possible instruction sequences up to a certain length starting from a set of initial states. Several case studies have demonstrated that SQED is highly effective at producing short bug traces by finding counterexamples to self-consistency in a variety of processor designs, including industrial designs [9]. Moreover, SQED substantially increases verification productivity.

However, until now there has been no rigorous theoretical understanding of (A) whether counterexamples to self-consistency found by SQED always correspond to actual bugs in the DUV (the soundness of SQED) and (B) whether for each bug in the DUV there exists a counterexample to self-consistency that SQED can find (the completeness of SQED). This paper makes significant progress towards closing this gap.

We model a processor as a transition system. This model abstracts away implementation-level details, yet is sufficiently precise to formalize the workings of SQED. To prove soundness and (conditional) completeness of SQED, we need to establish a correspondence between counterexamples to self-consistency and bugs in a DUV.
In our formal model we achieve this correspondence by first defining the correctness of instruction executions by means of a general, abstract specification. A bug is then a violation of this specification. The abstract specification expresses the following general and natural property we expect to hold for actual DUVs: an instruction writes a correct output value into a destination location and does not modify any other locations.

As our main results, we prove soundness and conditional completeness of SQED. For soundness, we prove that if SQED reports a counterexample to the universal self-consistency property, then the processor has a bug. This result shows that SQED does not produce spurious counterexamples. Importantly, this result holds regardless of the actual specification, confirming that SQED does not depend on such implementation-specific details. For completeness, we prove that if the processor has a bug then, under modest assumptions, there exists a counterexample to self-consistency that can be found by SQED. We also show that SQED can be made fully (unconditionally) complete with additional HW support in the form of specialized state reset instructions. Our results enable a rigorous understanding of SQED and its bug-finding capabilities in actual DUVs and provide insight on how to optimize implementations of SQED.

In the following, we first present an overview of SQED from a theoretical perspective (Section II). Then we define our transition system model of processors (Section III) and formalize the correctness of instruction executions in terms of an abstract specification relation (Section IV). After establishing a correspondence between the abstract specification and the self-consistency property employed by SQED (Section V), we prove soundness and (conditional) completeness of SQED (Section VI). We conclude with a discussion of related work and future research directions (Sections VII and VIII).

II. OVERVIEW OF SQED

We first informally introduce the basic concepts and terminology related to SQED. Fig. 1a shows an overview of the high-level workflow. Given a processor design $P$, i.e., the DUV, SQED is based on symbolic execution of instruction sequences using BMC. We assume that an instruction $i = (\mathit{op}, l, (l', l''))$ consists of an opcode $\mathit{op}$, an output location $l$, and a pair $(l', l'')$ of input locations. Locations are an abstraction used to represent registers and memory locations.

The self-consistency check is based on executing two instructions that should always produce the same result. The two instructions are called an original and a duplicate instruction, respectively. The duplicate instruction has the same opcode as the original one, i.e., it implements the same functionality, but it operates on different input and output locations. The locations on which the duplicate instruction operates are determined by an arbitrary but fixed bijective function $L_D : \mathcal{L}_O \to \mathcal{L}_D$ between two subsets $\mathcal{L}_O$, the original locations, and $\mathcal{L}_D$, the duplicate locations, that form a partition of the set $\mathcal{L}$ of all locations in $P$. An original instruction can only use locations in $\mathcal{L}_O$. An instruction duplication function $\mathit{Dup}$ then maps any original instruction $i_O$ to its duplicate $i_D$ by copying the opcode and then applying $L_D$ to its locations.

Example 1.
Let $\mathcal{L} = \{0, \ldots, 31\}$ be the identifiers of 32 registers of a processor $P$, and consider the partition $\mathcal{L}_O = \{0, 1, \ldots, 15\}$ and $\mathcal{L}_D = \{16, 17, \ldots, 31\}$. Let $i_O = (\mathit{ADD}, l_{\cdot}, (l_{\cdot}, l_{\cdot}))$ be an original register-type ADD instruction operating on three registers in $\mathcal{L}_O$. Using $L_D(k) = k + 16$, we obtain $\mathit{Dup}(i_O) = i_D = (\mathit{ADD}, l_{\cdot}, (l_{\cdot}, l_{\cdot}))$, where each register index of $i_O$ is shifted by 16. Consider a different partition $\mathcal{L}'_O = \{0, 2, 4, \ldots, 30\}$ and $\mathcal{L}'_D = \{1, 3, 5, \ldots, 31\}$ and function $L'_D(k) = k + 1$. For this function, $\mathit{Dup}(i_O) = (\mathit{ADD}, l_{\cdot}, (l_{\cdot}, l_{\cdot}))$, where each register index of $i_O$ is shifted by 1.

Footnote: This instruction model is used for simplicity, but it could easily be extended to allow instructions with additional inputs or outputs.
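The duplication step of Example 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Instr` tuple, the `dup` helper, and the concrete register indices 2, 4, and 6 are hypothetical and are not taken from the example above.

```python
from typing import Callable, NamedTuple, Tuple


class Instr(NamedTuple):
    op: str               # opcode
    out: int              # output location
    ins: Tuple[int, int]  # pair of input locations


def dup(i_o: Instr, l_d: Callable[[int], int]) -> Instr:
    """Copy the opcode and map every location through the bijection l_d."""
    return Instr(i_o.op, l_d(i_o.out), (l_d(i_o.ins[0]), l_d(i_o.ins[1])))


# Hypothetical register indices, chosen so that the instruction is an
# original instruction under both partitions discussed in Example 1.
i_o = Instr("ADD", 2, (4, 6))

print(dup(i_o, lambda k: k + 16))  # Instr(op='ADD', out=18, ins=(20, 22))
print(dup(i_o, lambda k: k + 1))   # Instr(op='ADD', out=3, ins=(5, 7))
```

Passing the mapping as a function mirrors the fact that the partition and the bijection are parameters of the approach rather than fixed properties of the processor.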
Self-consistency checking is implemented using QED tests. A QED test is an instruction sequence $\mathbf{i} = \mathbf{i}_O :: \mathbf{i}_D$ consisting of a sequence $\mathbf{i}_O$ of $n$ original instructions followed by a corresponding sequence $\mathbf{i}_D = \mathit{Dup}(\mathbf{i}_O)$ of $n$ duplicate instructions (where operator "::" denotes concatenation). A QED test $\mathbf{i}$ is symbolically executed from a QED-consistent state, that is, a state where the value stored in each original location $l$ is the same as the value stored in its corresponding duplicate location $L_D(l)$. The resulting final state after executing $\mathbf{i}$ should then also be QED-consistent. Fig. 1a illustrates the workflow. A QED test $\mathbf{i}$ succeeds if the final state that results from executing $\mathbf{i}$ is QED-consistent; otherwise it fails. Starting the execution in a QED-consistent state guarantees that original and duplicate instructions receive the same input values. Thus, if the final state is not QED-consistent, then this indicates that some pair of original and duplicate instructions behaved differently.

Example 2.
Consider Fig. 1b and the QED test $\mathbf{i} = i_O :: i_D$ consisting of one original instruction $i_O$ and its duplicate $\mathit{Dup}(i_O) = i_D$ for some function $L_D$. Suppose that $\mathbf{i}$ is executed in a QED-consistent state $s_0$ (denoted by $\mathit{QEDcons}(s_0)$ and $s_0(\mathcal{L}_O) = s_0(\mathcal{L}_D)$) and both $i_O$ and $i_D$ execute correctly. Instruction $i_O$ produces state $s_1$, where the values at duplicate locations remain unchanged, i.e., $s_0(\mathcal{L}_D) = s_1(\mathcal{L}_D)$, because $i_O$ operates on original locations only. When instruction $i_D$ is executed in state $s_1$, it modifies only duplicate locations. The final state $s_2$ is QED-consistent (denoted by $\mathit{QEDcons}(s_2)$ and $s_2(\mathcal{L}_O) = s_2(\mathcal{L}_D)$), and thus QED test $\mathbf{i}$ succeeds.

Example 3 (Bug Detection). Consider processor $P$ and $\mathcal{L}_O$ and $\mathcal{L}_D$ from Example 1. Let $i_{O,1} = (\mathit{ADD}, l_{\cdot}, (l_{\cdot}, l_{\cdot}))$ and $i_{O,2} = (\mathit{MUL}, l_{\cdot}, (l_{\cdot}, l_{\cdot}))$ be original register-type addition and multiplication instructions. Using $L_D(k) = k + 16$, we obtain $\mathit{Dup}(i_{O,1}) = i_{D,1} = (\mathit{ADD}, l_{\cdot}, (l_{\cdot}, l_{\cdot}))$ and $\mathit{Dup}(i_{O,2}) = i_{D,2} = (\mathit{MUL}, l_{\cdot}, (l_{\cdot}, l_{\cdot}))$. Assume that $P$ has a bug that is triggered when two MUL instructions are executed in subsequent clock cycles, resulting in the corruption of the output location of the second MUL instruction. Note that executing the QED test $\mathbf{i} = i_{O,1}, i_{O,2} :: i_{D,1}, i_{D,2}$ in a QED-consistent initial state produces a QED-consistent final state: the bug is not triggered by $\mathbf{i}$ because $i_{D,1}$ is executed between $i_{O,2}$ and $i_{D,2}$. A slightly longer test $\mathbf{i} = i_{O,2}, i_{O,1}, i_{O,2} :: i_{D,2}, i_{D,1}, i_{D,2}$ does trigger the bug, however, because the subsequence $i_{O,2}, i_{D,2}$ of two back-to-back MULs causes the first duplicate instruction $i_{D,2}$ in $\mathbf{i}$ to produce an incorrect result at its output location. This incorrect result then propagates through the next two instructions, resulting in a QED-inconsistent final state, since the values at the output locations of an original instruction and its duplicate differ.

Footnote: This scenario corresponds to a real bug in an out-of-order RISC-V design detected by SQED: https://github.com/ridecore/ridecore/issues/4.

Fig. 1. SQED workflow from a theoretical perspective (a) and illustration of executing the QED test $\mathbf{i} = i_O :: i_D$ in Example 2 (b).

QED-consistency is the universal, implementation-independent property that is checked in SQED. In practice, the property must refer to some basic information about the design, such as symbolic register names, but this can be generated automatically from a high-level ISA description [10]. BMC is used to symbolically and exhaustively generate all possible QED tests up to a certain length $n$ (the BMC bound). BMC ensures that SQED will find the shortest possible failing QED test first. The high-level workflow shown in Fig. 1a allows for flexibility in choosing the partition and mapping between original and duplicate locations. We rely on this flexibility for the results in this paper (Theorems 1 and 2). Current SQED implementations use a predefined partition and mapping, based on which BMC enumerates all possible QED tests. Extending implementations to have the BMC tool also choose a partition and mapping could be explored in future work.

We refer to related work [7], [9], [12] for case studies that demonstrate the effectiveness of BMC-based SQED on a variety of processor designs. The scalability of SQED in practice is determined by the scalability of the BMC tool being used. Thus, approaches for improving scalability of BMC can also be applied to SQED, e.g., abstraction, decomposition, and partial instantiation techniques [7].

III. INSTRUCTION AND PROCESSOR MODEL
We model a processor as a transition system containing an abstract set of locations. The set of locations includes registers and memory locations. A state of a processor consists of an architectural and a non-architectural part. In a state transition that results from executing an instruction, the architectural part of a state is modified explicitly by updating the value at the output location of the executed instruction. The architectural part of a state is also called the software-visible state of the processor. It comprises those parts of the state that can be updated by executing instructions of the user-level ISA of the processor, such as memory locations and general-purpose registers. The non-architectural part of a state comprises the remaining parts that are updated only implicitly by executing an instruction, such as pipeline or status registers.

Instructions are functions that take inputs from locations and write an output to a location. We assume that every instruction produces its result in one transition. In our model, we abstract away implementation details of complex processor designs (e.g., pipelined, out-of-order, multi-processor systems). This is for ease of presentation and reasoning. However, many of these complexities can be viewed as refinements of our abstraction, meaning that our formal results still hold on complex models (i.e., our results can be lowered to more detailed models such as those described in [7], [8]). Working out the details of such refinements is one important avenue for future work.
Definition 1 (Transition System). A processor is a transition system [13], [14] $P = (\mathcal{V}, \mathcal{L}, S_{\bar{a}}, s_{\bar{a},I}, \mathit{Op}, \mathcal{I}, T)$, where
• $\mathcal{V}$ is a set of abstract data values,
• $\mathcal{L}$ is a set of memory locations (from which we define the set $S_a$ of architectural states as the set of total functions from locations to values, i.e., $S_a = \{ s_a \mid s_a : \mathcal{L} \to \mathcal{V} \}$),
• $S_{\bar{a}}$ is a set of non-architectural states (from which we further define the set of all states as $S = S_a \times S_{\bar{a}}$),
• $s_{\bar{a},I} \in S_{\bar{a}}$ is a unique initial non-architectural state (from which we define the set of initial states as $S_I = S_a \times \{ s_{\bar{a},I} \}$),
• $\mathit{Op}$ is a set of operation codes (opcodes),
• $\mathcal{I} = \mathit{Op} \times \mathcal{L} \times \mathcal{L}^2$ is the set of instructions, and
• $T : S \times \mathcal{I} \to S$ is the transition function, which is total.

A state $s \in S$ with $s = (s_a, s_{\bar{a}})$ consists of an architectural part $s_a \in S_a$ and a non-architectural part $s_{\bar{a}} \in S_{\bar{a}}$. In the architectural part $s_a : \mathcal{L} \to \mathcal{V}$, $\mathcal{L}$ represents all possible registers and memory locations, i.e., in practical terms, $\mathcal{L}$ is the address space of $P$. An initial state $s_I \in S_I$ with $s_I = (s_a, s_{\bar{a},I})$ is defined by a unique non-architectural part $s_{\bar{a},I} \in S_{\bar{a}}$ and an arbitrary architectural part $s_a \in S_a$. We assume that $s_{\bar{a},I} \in S_{\bar{a}}$ is unique to make the exposition simpler. Our model could easily be extended to a set of initial non-architectural states. The number $|\mathcal{L}|$ of memory locations is arbitrary but fixed. We write $v = s(l)$ to denote the value $v = s_a(l)$ at location $l \in \mathcal{L}$ in state $s = (s_a, s_{\bar{a}})$. We also write $(v, v') = s(l, l')$ as shorthand for $v = s(l)$ and $v' = s(l')$.

To formally define instruction duplication, we need to reason about original and duplicate memory locations. To this end, we partition the set $\mathcal{L}$ of memory locations into two sets of equal size, the original and duplicate locations $\mathcal{L}_O$ and $\mathcal{L}_D$, respectively, i.e., $\mathcal{L}_O \cap \mathcal{L}_D = \emptyset$, $\mathcal{L}_O \cup \mathcal{L}_D = \mathcal{L}$, and $|\mathcal{L}_O| = |\mathcal{L}_D|$.
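To make Definition 1 and the notation $s(l)$ concrete, the following Python sketch models a very small processor. Everything specific in it is an assumption made for illustration: four locations, values modulo 4, the opcodes `ADD` and `MUL`, and a non-architectural part that merely records the last executed opcode.

```python
from dataclasses import dataclass
from typing import Tuple

MOD = 4  # assumed value domain V = {0, 1, 2, 3}

# An instruction (op, l_out, (l_in1, l_in2)), as in I = Op x L x L^2.
Instr = Tuple[str, int, Tuple[int, int]]


@dataclass(frozen=True)
class State:
    arch: Tuple[int, ...]  # architectural part s_a, indexed by location
    non_arch: str          # non-architectural part (assumed: last opcode)

    def __call__(self, l: int) -> int:
        """s(l) is shorthand for s_a(l)."""
        return self.arch[l]


INIT_NON_ARCH = "NONE"  # the unique initial non-architectural state

OPS = {
    "ADD": lambda a, b: (a + b) % MOD,
    "MUL": lambda a, b: (a * b) % MOD,
}


def T(s: State, i: Instr) -> State:
    """Total transition function: write the result to the output location."""
    op, l_out, (l1, l2) = i
    arch = list(s.arch)
    arch[l_out] = OPS[op](s(l1), s(l2))
    return State(tuple(arch), op)


s0 = State((1, 2, 0, 0), INIT_NON_ARCH)  # an initial state
s1 = T(s0, ("ADD", 2, (0, 1)))
print(s1(2))  # 3
```

States are immutable here so that a transition always yields a fresh successor state, which matches the functional reading of $T$.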
Given $\mathcal{L}_O$ and $\mathcal{L}_D$, we define an arbitrary but fixed bijective function $L_D : \mathcal{L}_O \to \mathcal{L}_D$ that maps an original location $l_O \in \mathcal{L}_O$ to its corresponding duplicate location $l_D = L_D(l_O)$. The inverse of $L_D$ is denoted by $L_D^{-1}$ and is uniquely defined. We write $(l_D, l'_D) = L_D(l_O, l'_O)$ as shorthand for $l_D = L_D(l_O)$ and $l'_D = L_D(l'_O)$. Function $L_D$ implements a correspondence between original and duplicate locations, which we need to define QED-consistency (Definition 11 below).

An instruction $i \in \mathcal{I}$ with $i = (\mathit{op}, l, (l', l''))$ is defined by an opcode $\mathit{op} \in \mathit{Op}$, an output location $l \in \mathcal{L}$, and a pair of input locations $(l', l'') \in \mathcal{L}^2$. Function $\mathit{op} : \mathcal{I} \to \mathit{Op}$ maps an instruction to its opcode $\mathit{op}(i)$. Functions $L_{out} : \mathcal{I} \to \mathcal{L}$ and $L_{in} : \mathcal{I} \to \mathcal{L}^2$ map an instruction $i$ to its output and input locations $L_{out}(i) = l$ and $L_{in}(i) = (l', l'')$, respectively. Given a state $s = (s_a, s_{\bar{a}})$, instruction $i$ reads values in $s$ from its input locations $L_{in}(i)$ and writes a value to its output location $L_{out}(i)$, resulting in a transition to a new state $s' = (s'_a, s'_{\bar{a}})$, written as $s' = T(s, i)$. The transition function $T$ is total, i.e., for every instruction $i$ and state $s$, there exists a successor state $s' = T(s, i)$. As mentioned above, we have kept the model simple in order to make the presentation more accessible, but our results can be lifted to many extensions, including, e.g., more complicated kinds of instructions or instructions with enabledness conditions (cf. [15]).

We write $\mathbf{i} \in \mathcal{I}^n$ and $\mathbf{s} \in S^n$ to denote sequences $\mathbf{i} = \langle i_1, \ldots, i_n \rangle$ and $\mathbf{s} = \langle s_1, \ldots, s_n \rangle$ of $n$ instructions and $n$ states, respectively. We will use $::$ for sequence concatenation and extend the transition function $T$ to sequences as follows.

Definition 2 (Path). Given sequences $\mathbf{i} = \langle i_1, \ldots, i_n \rangle$ and $\mathbf{s} = \langle s_0, \ldots, s_n \rangle$ of $n$ instructions and $n + 1$ states, $\mathbf{s}$ is a path from state $s_0 \in S$ to $s_n$ via $\mathbf{i}$, written $\mathbf{s} = T(s_0, \mathbf{i})$, iff $\bigwedge_{k=0}^{n-1} s_{k+1} = T(s_k, i_{k+1})$. If $\mathbf{s} = T(s_0, \mathbf{i})$, then for convenience we also write $s_n = T(s_0, \mathbf{i})$ to denote the final state $s_n$.

Definition 3 (Reachable State). A state $s$ is reachable, written $\mathit{reach}(s)$, iff $s = T(s_0, \mathbf{i})$ for some $s_0 \in S_I$ and instruction sequence $\mathbf{i}$.

The set $\mathcal{I}$ of instructions contains as proper subsets the sets of original and duplicate instructions, $\mathcal{I}_O$ and $\mathcal{I}_D$, respectively. Original (duplicate) instructions operate only on original (duplicate) locations, i.e., $\forall i_O \in \mathcal{I}_O.\ L_{in}(i_O) \in \mathcal{L}_O^2 \wedge L_{out}(i_O) \in \mathcal{L}_O$ and $\forall i_D \in \mathcal{I}_D.\ L_{in}(i_D) \in \mathcal{L}_D^2 \wedge L_{out}(i_D) \in \mathcal{L}_D$. Given these definitions, we formalize instruction duplication as follows.

Definition 4 (Instruction Duplication). Let $\mathit{Dup} : \mathcal{I}_O \to \mathcal{I}_D$ be an instruction duplication function that maps an original instruction $i_O = (\mathit{op}, l_O, (l'_O, l''_O))$ to a duplicate instruction $i_D = \mathit{Dup}(i_O) = (\mathit{op}, L_D(l_O), L_D(l'_O, l''_O))$ with respect to the bijective function $L_D$.

An original instruction and its duplicate have the same opcode. We write $\mathbf{i}_O \in \mathcal{I}_O^n$ and $\mathbf{i}_D \in \mathcal{I}_D^n$ to denote sequences $\mathbf{i}_O = \langle i_{O,1}, \ldots, i_{O,n} \rangle$ and $\mathbf{i}_D = \langle i_{D,1}, \ldots, i_{D,n} \rangle$ of $n$ original and duplicate instructions, respectively. We lift $\mathit{Dup}$ in the natural way also to sequences of instructions as follows.
Definition 5 (Instruction Sequence Duplication). Let $\mathbf{i}_O = \langle i_{O,1}, \ldots, i_{O,n} \rangle$ be a sequence of original instructions. Then $\mathit{Dup}(\mathbf{i}_O) = \langle \mathit{Dup}(i_{O,1}), \ldots, \mathit{Dup}(i_{O,n}) \rangle$.

IV. FORMALIZING CORRECTNESS
We formalize the correctness of instruction executions in a processor $P$ using an abstract specification relation. We then link this abstract specification to QED-consistency, the self-consistency property employed by SQED (Section V below).

For our formalization, we assume that every opcode $\mathit{op} \in \mathit{Op}$ has a specification function $\mathit{Spec}_{op} : \mathcal{V}^2 \to \mathcal{V}$ that specifies how the opcode computes an output value from input values. Using this family of functions, we define an overall abstract specification relation $\mathit{Spec} \subseteq S \times \mathcal{I} \times S$, which expresses when an instruction $i \in \mathcal{I}$ can transition to a state $s' \in S$ from a state $s \in S$ while respecting the opcode specification.

Definition 6 (Abstract Specification).

$$\forall s, s' \in S,\ i \in \mathcal{I}.\ \mathit{Spec}(s, i, s') \leftrightarrow \forall l \in \mathcal{L}.\ \big(l \neq L_{out}(i) \rightarrow s(l) = s'(l)\big) \wedge \big(l = L_{out}(i) \rightarrow s'(l) = \mathit{Spec}_{op(i)}(s(L_{in}(i)))\big) \tag{1}$$

Equation (1) states general and natural properties that we expect to hold for a processor $P$. If an instruction $i$ executes according to its specification, then the values at locations that are not output locations of $i$ are unchanged. Additionally, the value produced at the output location of the instruction must agree with the value specified by function $\mathit{Spec}_{op(i)}$. Note that the specification relation $\mathit{Spec}$ specifies only how the architectural part of a state is updated by a transition (not the non-architectural part). Consequently, there might exist multiple states whose non-architectural parts satisfy the right-hand side of (1). This is why $\mathit{Spec}$ is a relation rather than a function.

As special cases of (1), original and duplicate instructions have the following properties:

$$\forall s, s' \in S,\ i_O \in \mathcal{I}_O,\ l_O \in \mathcal{L}_O,\ i_D \in \mathcal{I}_D,\ l_D \in \mathcal{L}_D.$$
$$\big(\mathit{Spec}(s, i_O, s') \rightarrow s(l_D) = s'(l_D)\big)\ \wedge \tag{2}$$
$$\big(\mathit{Spec}(s, i_D, s') \rightarrow s(l_O) = s'(l_O)\big) \tag{3}$$

Equations (2) and (3) express that the execution of an original (duplicate) instruction does not change the values at duplicate (original) locations if the instruction executes according to its specification. The following functional congruence property of instructions also follows from (1):

$$\forall s_1, s_2, s', s'' \in S,\ i, i' \in \mathcal{I}.\ \big[\mathit{op}(i) = \mathit{op}(i') \wedge \mathit{Spec}(s_1, i, s') \wedge \mathit{Spec}(s_2, i', s'') \wedge s_1(L_{in}(i)) = s_2(L_{in}(i'))\big] \rightarrow s'(L_{out}(i)) = s''(L_{out}(i')) \tag{4}$$

By functional congruence, if two instructions with the same opcode are executed on inputs with the same values, then the output values are the same. We next define the correctness of a processor $P$ based on the abstract specification $\mathit{Spec}$.

Definition 7 (Correctness). A processor $P$ is correct with respect to specification $\mathit{Spec}$ iff $\forall i \in \mathcal{I},\ s \in S.\ \mathit{reach}(s) \rightarrow \mathit{Spec}(s, i, T(s, i))$.

Correctness requires every instruction to execute according to the abstract specification $\mathit{Spec}$ in every reachable state of $P$. A bug in $P$ is a counterexample to correctness, i.e., an instruction that fails in at least one (not necessarily initial) reachable state and may or may not fail in other states.

Definition 8 (Bug). A bug with respect to specification $\mathit{Spec}$ in a processor $P$ is defined by a pair $B = \langle i_b, S_b \rangle$ consisting of an instruction $i_b \in \mathcal{I}$ and a non-empty set $S_b \subseteq S$ of states such that $S_b = \{ s \in S \mid \mathit{reach}(s) \wedge \neg \mathit{Spec}(s, i_b, T(s, i_b)) \}$.

The above definitions rely on the notion of an abstract specification relation. Having some abstract specification is a theoretical construct that is necessary to formally characterize instruction failure and establish formal proofs about SQED. However, it is important to note that to apply SQED in practice, we do not need to know what the abstract specification relation is.

A bug $\langle i_b, S_b \rangle$ is precisely characterized by the set $S_b$ of all reachable states in which $i_b$ fails. The following proposition follows from Definitions 7 and 8.

Proposition 1.
A processor $P$ has a bug with respect to specification $\mathit{Spec}$ iff it is not correct with respect to $\mathit{Spec}$.

As special cases of processor correctness and bugs, respectively, we define correctness and bugs with respect to instructions that are executed in an initial state only.
Definition 9 (Single-Instruction Correctness). Processor $P$ is single-instruction correct iff $\forall i \in \mathcal{I},\ s_0 \in S_I.\ \mathit{Spec}(s_0, i, T(s_0, i))$.

Single-instruction correctness implies that all instructions, i.e., all opcodes and all combinations of input and output locations, execute correctly in all initial states. A single-instruction bug is a counterexample to single-instruction correctness.
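For a finite toy model, the specification relation of Definition 6 and the check of Definition 9 can be written down directly and evaluated by enumeration. The sketch below is an illustration only and rests on several assumptions that are not part of the formal model: three locations, one-bit values, the opcodes `AND` and `XOR`, a hypothetical bug in `t_bad`, and architectural states represented as tuples (the non-architectural part is omitted because the specification relation constrains only the architectural part).

```python
from itertools import product

LOCS = range(3)   # assumed set of locations
VALS = range(2)   # assumed one-bit value domain
SPEC_OP = {"AND": lambda a, b: a & b, "XOR": lambda a, b: a ^ b}


def spec(s, i, s_next):
    """Definition 6: only the output location changes, and it changes correctly."""
    op, l_out, (l1, l2) = i
    for l in LOCS:
        if l != l_out and s[l] != s_next[l]:
            return False
        if l == l_out and s_next[l] != SPEC_OP[op](s[l1], s[l2]):
            return False
    return True


def t_good(s, i):
    """A transition function that follows the opcode specification."""
    op, l_out, (l1, l2) = i
    s_next = list(s)
    s_next[l_out] = SPEC_OP[op](s[l1], s[l2])
    return tuple(s_next)


def t_bad(s, i):
    """Hypothetical bug: XOR additionally clears location 0."""
    s_next = list(t_good(s, i))
    if i[0] == "XOR":
        s_next[0] = 0
    return tuple(s_next)


def single_instruction_correct(t):
    """Definition 9, checked over all instructions and all initial states."""
    instrs = [(op, o, (a, b)) for op in SPEC_OP
              for o, a, b in product(LOCS, repeat=3)]
    states = product(VALS, repeat=len(LOCS))
    return all(spec(s, i, t(s, i)) for s in states for i in instrs)


print(single_instruction_correct(t_good))  # True
print(single_instruction_correct(t_bad))   # False
```

Exhaustive enumeration is feasible only because the model is tiny; it stands in for the symbolic reasoning a formal tool would perform.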
Definition 10 (Single-Instruction Bug). Processor $P$ has a single-instruction bug with respect to specification $\mathit{Spec}$ iff $\exists i \in \mathcal{I},\ s_0 \in S_I.\ \neg \mathit{Spec}(s_0, i, T(s_0, i))$.

Several approaches exist for single-instruction checking of a processor, which is complementary to SQED (cf. Section VII).

V. SELF-CONSISTENCY AS QED-CONSISTENCY
We now define QED-consistency (cf. Section II) as a property of states of a processor $P$ based on function $L_D$. Then we formally define the notion of QED test and show that for correct processors, QED tests preserve QED-consistency. This result is key to the proof of soundness in Section VI below.

Definition 11 (QED-Consistency). A state $s$ is QED-consistent, written $\mathit{QEDcons}(s)$, iff $\forall l_O \in \mathcal{L}_O.\ s(l_O) = s(L_D(l_O))$.

QED-consistency is based on checking the architectural part of a state. An equivalent condition can be formulated in terms of duplicate locations: $\forall l_D \in \mathcal{L}_D.\ s(l_D) = s(L_D^{-1}(l_D))$.

Definition 12 (QED test). An instruction sequence $\mathbf{i}$ is a QED test if $\mathbf{i} = \mathbf{i}_O :: \mathit{Dup}(\mathbf{i}_O)$ for some sequence $\mathbf{i}_O$ of original instructions.

We link the abstract specification $\mathit{Spec}$ to the semantics of original and duplicate instructions. This way, we obtain a notion of functional congruence that readily follows as a special case from (4).
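Definitions 11 and 12 translate almost literally into executable predicates. The following Python sketch is a minimal illustration under assumed conventions that are not fixed by the definitions: locations are integers, the original locations are `0 .. n_orig - 1`, the bijection is $L_D(k) = k + n_{orig}$, and an architectural state is a tuple indexed by location. The helper names are hypothetical.

```python
def dup(instr, n_orig):
    """Dup for the assumed mapping L_D(k) = k + n_orig."""
    op, l_out, (l1, l2) = instr
    return (op, l_out + n_orig, (l1 + n_orig, l2 + n_orig))


def qed_consistent(arch, n_orig):
    """Definition 11: every original location agrees with its duplicate."""
    return all(arch[l] == arch[l + n_orig] for l in range(n_orig))


def make_qed_test(i_orig, n_orig):
    """Definition 12: an original sequence followed by its duplicate."""
    return list(i_orig) + [dup(i, n_orig) for i in i_orig]


def is_qed_test(seq, n_orig):
    """Check that seq has the shape i_O :: Dup(i_O)."""
    n, odd = divmod(len(seq), 2)
    if odd:
        return False
    first, second = list(seq[:n]), list(seq[n:])
    originals_ok = all(0 <= x < n_orig
                       for _, o, (a, b) in first for x in (o, a, b))
    return originals_ok and second == [dup(i, n_orig) for i in first]


test = make_qed_test([("ADD", 0, (0, 1))], n_orig=2)
print(test)                  # [('ADD', 0, (0, 1)), ('ADD', 2, (2, 3))]
print(is_qed_test(test, 2))  # True
```

Keeping `n_orig` as a parameter reflects that the partition into original and duplicate locations is a choice made when setting up the check.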
Corollary 1 (Functional Congruence: Duplicate Instructions). Given $i_O \in \mathcal{I}_O$ and $i_D \in \mathcal{I}_D$ with $i_D = \mathit{Dup}(i_O)$, the following holds for all states $s_1$, $s_2$, $s'$, and $s''$:

$$\big[\mathit{Spec}(s_1, i_O, s') \wedge \mathit{Spec}(s_2, i_D, s'') \wedge s_1(L_{in}(i_O)) = s_2(L_D(L_{in}(i_O)))\big] \rightarrow s'(L_{out}(i_O)) = s''(L_D(L_{out}(i_O)))$$

Corollary 1 states that an original instruction $i_O$ produces the same value at its output location as its duplicate instruction $i_D = \mathit{Dup}(i_O)$, provided that these instructions execute in states where the values at the respective input locations match.

We generalize Corollary 1 to show that after executing a pair of original and duplicate instructions, the values at all original locations match the values at the corresponding duplicate locations, assuming those values also matched before executing the instructions.

Lemma 1 (cf. Corollary 1). Given $i_O \in \mathcal{I}_O$ and $i_D \in \mathcal{I}_D$ with $i_D = \mathit{Dup}(i_O)$, the following holds for all states $s_1$, $s_2$, $s'$, and $s''$:

$$\big[\mathit{Spec}(s_1, i_O, s') \wedge \mathit{Spec}(s_2, i_D, s'') \wedge \forall l_O \in \mathcal{L}_O.\ s_1(l_O) = s_2(L_D(l_O))\big] \rightarrow \forall l_O \in \mathcal{L}_O.\ s'(l_O) = s''(L_D(l_O))$$

Proof. See appendix.

Lemma 1 leads to an important result that we need to prove soundness of SQED (Lemma 3 below): executing a QED test $\mathbf{i}$ starting in a QED-consistent state results in a QED-consistent final state if all instructions in $\mathbf{i}$ execute according to the abstract specification $\mathit{Spec}$ (cf. Fig. 1b).
Lemma 2 (QED-Consistency and QED tests). Let $\mathbf{i} = \langle i_1, \ldots, i_{2n} \rangle$ be a QED test, let $\langle s_0, \ldots, s_{2n} \rangle$ be a sequence of $2n + 1$ states, and let $\mathit{Spec}$ be some abstract specification relation. Then,

$$\mathit{QEDcons}(s_0) \wedge \Big( \bigwedge_{j=0}^{2n-1} \mathit{Spec}(s_j, i_{j+1}, s_{j+1}) \Big) \rightarrow \mathit{QEDcons}(s_{2n})$$

Proof. Assuming the antecedent, let $l_O \in \mathcal{L}_O$ be arbitrary but fixed with $l_D = L_D(l_O)$. By repeated application of (2), we derive $s_0(l_D) = s_1(l_D) = \ldots = s_n(l_D)$, and hence:

$$s_0(l_D) = s_n(l_D) \tag{5}$$

by transitivity. By repeated application of (3), we derive:

$$s_n(l_O) = s_{2n}(l_O) \tag{6}$$

Now, $\mathit{QEDcons}(s_0)$ implies $s_0(l_O) = s_0(L_D(l_O))$, from which it follows by (5) that $s_0(l_O) = s_n(L_D(l_O))$. By repeated application of Lemma 1, we can next derive $s_j(l_O) = s_{n+j}(L_D(l_O))$ for $0 \leq j \leq n$, and in particular, $s_n(l_O) = s_{2n}(L_D(l_O))$. Finally, by applying (6), we get $s_{2n}(l_O) = s_{2n}(L_D(l_O))$. Since $l_O$ was chosen arbitrarily, $\mathit{QEDcons}(s_{2n})$ holds.

VI. SOUNDNESS AND CONDITIONAL COMPLETENESS
SQED checks a processor $P$ for self-consistency by executing QED tests and checking QED-consistency (cf. Fig. 1a). We now define the correctness of $P$ in terms of QED tests that, when executed, always result in QED-consistent states. This way, we establish a correspondence between counterexamples to QED-consistency and bugs in $P$. We then prove our main results (Theorem 1) related to the bug-finding capabilities of SQED, i.e., soundness and conditional completeness.

Definition 13 (Failing and Succeeding QED Tests). Let $\mathbf{i}$ be a QED test, $s_0 \in S_I$ an initial state such that $\mathit{QEDcons}(s_0)$ holds, and let $s = T(s_0, \mathbf{i})$. We say that:
• QED test $\mathbf{i}$ fails if $\neg \mathit{QEDcons}(s)$.
• QED test $\mathbf{i}$ succeeds if $\mathit{QEDcons}(s)$.

Definition 14 (Processor QED-Consistency). A processor $P$ is QED-consistent if all possible QED tests succeed.

Definition 15 (Processor QED-Inconsistency). A processor $P$ is QED-inconsistent if some QED test fails.
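Definitions 13 to 15 can be explored on a toy processor by plain enumeration. The Python sketch below is not BMC and not an actual SQED implementation; it is an explicit-state search over a tiny model in which everything is assumed for illustration: two original and two duplicate locations with $L_D(k) = k + 2$, values modulo 4, and a hypothetical bug, loosely modeled on Example 3, that corrupts the output of a MUL executed directly after another MUL.

```python
from itertools import product

N_ORIG = 2  # assumed: |L_O| = |L_D| = 2 with L_D(k) = k + 2
MOD = 4     # assumed value domain V = {0, 1, 2, 3}
OPS = {"ADD": lambda a, b: (a + b) % MOD, "MUL": lambda a, b: (a * b) % MOD}


def step(state, instr, buggy):
    """Transition function; the non-architectural part is the previous opcode."""
    arch, prev_op = state
    op, l_out, (l1, l2) = instr
    value = OPS[op](arch[l1], arch[l2])
    if buggy and op == "MUL" and prev_op == "MUL":
        value = (value + 1) % MOD  # hypothetical corruption of the output
    return arch[:l_out] + (value,) + arch[l_out + 1:], op


def dup(instr):
    op, l_out, (l1, l2) = instr
    return op, l_out + N_ORIG, (l1 + N_ORIG, l2 + N_ORIG)


def qed_consistent(state):
    arch, _ = state
    return all(arch[l] == arch[l + N_ORIG] for l in range(N_ORIG))


def find_failing_qed_test(buggy, max_n):
    """Enumerate QED tests by increasing n; return the first failing one."""
    originals = [(op, o, (a, b)) for op in OPS
                 for o, a, b in product(range(N_ORIG), repeat=3)]
    for n in range(1, max_n + 1):
        for i_orig in product(originals, repeat=n):
            test = list(i_orig) + [dup(i) for i in i_orig]
            for vals in product(range(MOD), repeat=N_ORIG):
                state = (vals + vals, "NONE")  # QED-consistent initial state
                for instr in test:
                    state = step(state, instr, buggy)
                if not qed_consistent(state):
                    return test
    return None


print(find_failing_qed_test(buggy=False, max_n=2))  # None
print(find_failing_qed_test(buggy=True, max_n=2))
# [('MUL', 0, (0, 0)), ('MUL', 2, (2, 2))]
```

Searching by increasing length imitates the property that the shortest failing test is found first. In this toy model a single MUL followed by its duplicate already places two MULs back to back, so the first failing test has length $n = 1$.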
Lemma 3.
Let P be a processor. If P is QED-inconsistent,then P is not correct with respect to any abstract specificationrelation.Proof. Let i be a failing QED test for P and assume that proces-sor P is correct with respect to some abstract specification rela-tion Spec . By Lemma 2, we conclude
QEDcons ( s n ) , whichcontradicts the assumption that i is a failing QED test.Importantly, Lemma 3 holds regardless of what the actualspecification relation Spec is, i.e., it is independent of
Spec and the opcode specification function
$Spec_{op}$ (Definition 6).

Lemma 3 shows that SQED is a sound technique: any error reported by a failing QED test is in fact a real bug in the system. It is more challenging to determine the degree to which SQED is complete, that is, for which bugs do there exist failing QED tests? We address this question next.

Suppose that $B = \langle i_b, S_b \rangle$ is a bug with respect to a specification $Spec$ in a processor $P$, where $i_b = (op_b, l^b_{out}, (l^b_{in1}, l^b_{in2}))$. A bug-specific QED test for $B$ is a QED test that sets up the conditions for and includes the activation of the bug. By Definition 8, if $i_b$ is executed in $P$ starting from any state in $S_b$, the specification is violated. That is, for each $s_b \in S_b$, $\neg Spec(s_b, i_b, T(s_b, i_b))$. Let $s = T(s_b, i_b)$. According to (1), there are two ways the specification can be violated. Either: (A) the value in the output location of $i_b$ is different from that required by $Spec$, i.e., $s(l^b_{out}) \neq Spec_{op_b}(s_b(l^b_{in1}), s_b(l^b_{in2}))$, which we call a type-A bug; or (B) the value in some other, non-output location $l_{bad}$ is not preserved, i.e., $s(l_{bad}) \neq s_b(l_{bad})$ for some $l_{bad} \neq l^b_{out}$, which we call a type-B bug. We now define a bug-specific QED test formally.

Definition 16 (Bug-Specific QED Test). Let $B = \langle i_b, S_b \rangle$ be a bug in $P$ with respect to $Spec$, where $i_b = (op_b, l^b_{out}, (l^b_{in1}, l^b_{in2}))$. The instruction sequence $\mathbf{i} = \langle i_1, \ldots, i_n, i_{n+1}, \ldots, i_{2n} \rangle$ is a bug-specific QED test for $B$ if the following conditions hold:
1. $i_{n+1} = i_b$.
2. $\mathbf{i}$ is a QED test for some $L_D$, i.e., for $1 \leq k \leq n$, $i_{n+k} = Dup(i_k)$. In particular, $i_1 = (op_b, l_{out}, (l_{in1}, l_{in2}))$, with $(l_{in1}, l_{in2}, l_{out}) = L_D^{-1}((l^b_{in1}, l^b_{in2}, l^b_{out}))$.
3. There exists a path $\mathbf{s} \in S^{2n}$ from $s_0 \in S_I$ with $QEDcons(s_0)$, such that $\mathbf{s} = T(s_0, \mathbf{i}) = \langle s_1, \ldots, s_n, s_{n+1}, \ldots, s_{2n} \rangle$, where $s_n \in S_b$.
4. $Spec(s_0, i_1, s_1)$.
5. Additionally, we need three more conditions that depend on the bug types:
   Case A: If $i_b$ is a type-A bug with respect to $s_n$, i.e., $s_{n+1}(l^b_{out}) \neq Spec_{op_b}(s_n(l^b_{in1}), s_n(l^b_{in2}))$, then let $l_{orig} = l_{out}$ and $l_{dup} = l^b_{out}$. We then require:
   - $s_{n+1}(l_{dup}) = s_{2n}(l_{dup})$,
   - $s_1(l_{orig}) = s_{2n}(l_{orig})$,
   - $s_0(L_{in}(i_b)) = s_n(L_{in}(i_b))$.
   Case B: If $i_b$ is a type-B bug with respect to $s_n$, i.e., $s_n(l_{bad}) \neq s_{n+1}(l_{bad})$ for some $l_{bad} \neq l^b_{out}$, then let $l_{orig} = L_D^{-1}(l_{bad})$ with $l_{orig} \neq l_{out}$ and $l_{dup} = l_{bad}$. We then require:
   - $s_{n+1}(l_{dup}) = s_{2n}(l_{dup})$,
   - $s_1(l_{orig}) = s_{2n}(l_{orig})$,
   - $s_1(l_{dup}) = s_n(l_{dup})$.

Clearly, it is always possible to satisfy the first two conditions by declaring the buggy instruction $i_b$ to be the duplicate of $i_1$ with respect to some function $L_D$. Moreover, if we restrict our attention to single-instruction correct processors, then the fourth condition always holds as well. This fits in well with the stated intended role of SQED, which is to find sequence-dependent bugs rather than single-instruction bugs.

Understanding when the remaining conditions 3 and 5 hold is more complicated. We must find some instruction sequence $\mathbf{i}^* = \langle i_2 \ldots i_n \rangle$ that can transition $P$ from the state $s_1$ following the execution of $i_1$ to one of the bug-triggering states in $S_b$, i.e., $s_n$. Often it is reasonable to assume that $P$ is strongly connected, i.e., that there always exists an instruction sequence that can transition from one reachable state to another. This is almost enough to ensure the existence of $\mathbf{i}^*$. However, there are a few other restrictions on $\mathbf{i}^*$ to satisfy Definition 16. First, $\mathbf{i}^*$ must consist of only original instructions to satisfy the definition of a QED test. We are free to choose $L_D$ to be anything that works, so the main restriction is that $\mathbf{i}^*$ cannot use any instructions referencing locations that are used by $i_b$, i.e., $l^b_{in1}$, $l^b_{in2}$, or $l^b_{out}$.
Note that we defined $i_{n+1} = i_b$ to be the first duplicate instruction. This ends up being the most severe restriction on $\mathbf{i}^*$ because it means that instructions in $\mathbf{i}^*$ [...] $i_b$. We discuss some mitigations to this restriction in Section VI-A.

Somewhat surprisingly, the three requirements in condition 5 are not very severe, as we now explain. For both type-A and type-B bugs, locations $l_{orig}$ and $l_{dup}$ are an original location and its duplicate, respectively, that will hold inconsistent values when the QED test $\mathbf{i}$ fails. For type-A bugs, $l_{orig}$ holds the correct output value of $i_1$ and $l_{dup}$ holds the incorrect output value of $i_b$. For type-B bugs, $l_{dup}$ holds the value of location $l_{bad}$ that is incorrectly modified when $i_b$ is executed in state $s_n$, and $l_{orig}$ is the original location that corresponds to $l_{dup} = l_{bad}$.

The first requirement $s_{n+1}(l_{dup}) = s_{2n}(l_{dup})$ means that the duplicate sequence $Dup(\mathbf{i}^*)$ of $\mathbf{i}^*$ in the QED test has to preserve the value of $l_{dup}$ in $s_{n+1}$ also in the final state $s_{2n}$. Further, since $l_{orig} = L_D^{-1}(l_{dup})$, this also imposes restrictions on the modifications that $\mathbf{i}^*$ can make to $l_{orig}$. However, as this is just one original location, it is unlikely that every possible $\mathbf{i}^*$ would need to modify it to get to some bug-triggering state $s_n$.

The second requirement is $s_1(l_{orig}) = s_{2n}(l_{orig})$. For similar reasons, it is unlikely that $\mathbf{i}^*$ would need to modify $l_{orig}$, and the duplicate sequence $Dup(\mathbf{i}^*)$ of $\mathbf{i}^*$ should not modify it either, since it is an original location and original locations should be left alone by duplicate instructions. Although the buggy instruction $i_b$ might modify $l_{orig}$ if it has more than one bug effect, we may be able to choose the locations of $i_1$ and $L_D$ differently to avoid this.

Finally, the last requirement of condition 5 depends on the two cases A and B. In both cases, we require that $\mathbf{i}^*$ does not modify certain duplicate locations: the input locations $L_{in}(i_b)$ of $i_b$ (A) and location $l_{dup}$ that is incorrectly modified by $i_b$ (B). Sequence $\mathbf{i}^*$ should not modify any duplicate locations as it is composed of original instructions. Note that we do not have to make the strong assumption that $\mathbf{i}^*$ executes according to its specification, only that it avoids corrupting a few key locations. Given that we have a lot of freedom in choosing $L_D$ and hence the locations of $i_1$, these requirements are likely to be satisfiable if there are some degrees of freedom in choosing a path to one of the bug-triggering states.

We now prove our conditional completeness property, namely that if a bug-specific QED test $\mathbf{i}$ exists, then $\mathbf{i}$ fails.

Lemma 4.
Let $P$ be a processor with a bug $B = \langle i_b, S_b \rangle$ with respect to specification $Spec$, for which there exists a bug-specific QED test $\mathbf{i}$. Then $\mathbf{i}$ fails.

Proof. Let $B = \langle i_b, S_b \rangle$ be a bug and $\mathbf{i}$ be a bug-specific QED test for $B$. By Definition 16 we have $\mathbf{i} = \langle i_1, \ldots, i_n, i_{n+1}, \ldots, i_{2n} \rangle$ and $\mathbf{s} = T(s_0, \mathbf{i}) = \langle s_1, s_2, \ldots, s_n, s_{n+1}, \ldots, s_{2n} \rangle$, where $s_n \in S_b$ and $i_b = i_{n+1}$, and $QEDcons(s_0)$ holds. We show that $\neg QEDcons(s_{2n})$ holds by showing that $s_{2n}(l_{orig}) \neq s_{2n}(l_{dup})$. We distinguish the two cases A and B in Definition 16.

Case A. Since $QEDcons(s_0)$ and $Dup(i_1) = i_b$, we have

  $s_0(L_{in}(i_1)) = s_0(L_{in}(i_b))$    (7)

From the third requirement of Case A in Definition 16, we have $s_0(L_{in}(i_b)) = s_n(L_{in}(i_b))$, so it follows that

  $s_0(L_{in}(i_1)) = s_n(L_{in}(i_b))$    (8)

By (8) and since $op(i_1) = op(i_b)$, also

  $Spec_{op(i_1)}(s_0(L_{in}(i_1))) = Spec_{op(i_b)}(s_n(L_{in}(i_b)))$    (9)

Since $Spec(s_0, i_1, s_1)$ by Definition 16, we have

  $s_1(L_{out}(i_1)) = Spec_{op(i_1)}(s_0(L_{in}(i_1)))$    (10)

Since we are in Case A, we have from Definition 16 that $l_{orig} = L_{out}(i_1)$, and from the second requirement of Case A, we have $s_1(l_{orig}) = s_{2n}(l_{orig})$, so it follows that

  $s_{2n}(l_{orig}) = Spec_{op(i_1)}(s_0(L_{in}(i_1)))$    (11)

Since $i_b$ fails in state $s_n$, we have that

  $s_{n+1}(L_{out}(i_b)) \neq Spec_{op(i_b)}(s_n(L_{in}(i_b)))$    (12)

Again, from Case A in Definition 16, we have $l_{dup} = L_{out}(i_b)$, and from the first requirement of Case A, we have $s_{n+1}(l_{dup}) = s_{2n}(l_{dup})$, so it follows that

  $s_{2n}(l_{dup}) \neq Spec_{op(i_b)}(s_n(L_{in}(i_b)))$    (13)

Finally, (9) and (11) give us

  $s_{2n}(l_{orig}) = Spec_{op(i_b)}(s_n(L_{in}(i_b)))$    (14)

But then (13) and (14) imply $s_{2n}(l_{orig}) \neq s_{2n}(l_{dup})$, and hence $\neg QEDcons(s_{2n})$.

Case B. See appendix.
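To make the argument of Lemma 4 concrete, the following Python sketch runs a bug-specific QED test on a toy processor. Everything in it is an assumption made for illustration and is not part of the formal model: the four original registers (0 to 3) with duplicates (4 to 7), the helper names `step`, `dup`, and `run_qed_test`, and the injected type-A bug (an ADD executed directly after a MUL writes a result that is off by one).

```python
from typing import List, Optional, Tuple

# Hypothetical toy model: an instruction is (opcode, output reg, input reg 1, input reg 2).
Instr = Tuple[str, int, int, int]
# A state is (register values, last executed opcode); the last opcode plays the
# role of the non-architectural part of the state.
State = Tuple[Tuple[int, ...], Optional[str]]

N_ORIG = 4  # registers 0..3 are original locations, 4..7 their duplicates


def step(state: State, instr: Instr, buggy: bool = True) -> State:
    """Execute one instruction; ADD right after MUL is off by one if buggy."""
    regs, last_op = state
    op, out, in1, in2 = instr
    value = regs[in1] + regs[in2] if op == "ADD" else regs[in1] * regs[in2]
    if buggy and op == "ADD" and last_op == "MUL":
        value += 1  # injected sequence-dependent bug (assumption of this sketch)
    new_regs = list(regs)
    new_regs[out] = value
    return (tuple(new_regs), op)


def dup(instr: Instr) -> Instr:
    """Map an original instruction to its duplicate (register r -> r + N_ORIG)."""
    op, out, in1, in2 = instr
    return (op, out + N_ORIG, in1 + N_ORIG, in2 + N_ORIG)


def qed_consistent(state: State) -> bool:
    """Every original register holds the same value as its duplicate."""
    regs = state[0]
    return all(regs[l] == regs[l + N_ORIG] for l in range(N_ORIG))


def run_qed_test(originals: List[Instr], init_regs: List[int], buggy: bool = True) -> bool:
    """Run originals followed by their duplicates; True means the test succeeds."""
    state: State = (tuple(init_regs), None)
    assert qed_consistent(state)  # QED tests start in a QED-consistent state
    for instr in originals + [dup(i) for i in originals]:
        state = step(state, instr, buggy)
    return qed_consistent(state)


# The first original is an ADD (correct from the initial state); the MUL sets up
# the bug-triggering state, so the first duplicate (an ADD) is the buggy instruction.
test = [("ADD", 2, 0, 1), ("MUL", 3, 0, 1)]
init = [3, 5, 0, 0, 3, 5, 0, 0]
print(run_qed_test(test, init))               # False: the QED test fails
print(run_qed_test(test, init, buggy=False))  # True: no bug, test succeeds
```

In this run the original ADD writes 8 to register 2, while its duplicate, executed directly after the MUL, writes 9 to register 6, so the final state is QED-inconsistent.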
Theorem 1.
• SQED is sound (Lemma 3).
• SQED is complete for bugs for which a bug-specific QED test exists (Lemma 4).

Theorem 1 is relevant for practical applications of SQED. Referring to the high-level workflow shown in Fig. 1a, BMC symbolically explores all possible QED tests up to bound $n$ for a particular fixed mapping $L_D$. If a failing QED test $\mathbf{i}$ is found, then by the soundness of SQED, $\mathbf{i}$ corresponds to a bug in the processor. By completeness, if there exists a bug for which a bug-specific QED test $\mathbf{i}$ exists, then with a sufficiently large bound $n$, BMC will find a sequence $\mathbf{i}$ that will fail.

A. Extensions
We now consider variants of QED tests that cover a larger class of bugs (i.e., bugs that cannot be detected by a bug-specific QED test). Ultimately, with hardware support we obtain a family of QED tests which, together with single-instruction correctness, results in a complete variant of SQED (Theorem 2).

The main limitation of bug-specific QED tests arises from the fact that QED tests consist of a sequence of original instructions followed by duplicate ones. This makes it impossible to set up a bug-specific QED test for an important class of forwarding-logic bugs (a simple refinement of our model can be used for the important case of pipelined systems). To see why, consider that a bug-triggering state $s_n \in S_b$ must be reached by executing a sequence of original instructions. The buggy instruction, which is a duplicate, is executed in state $s_n$ and would have to read a value from some original location written previously.

To resolve this limitation, first note that there is another way that SQED can find bugs, namely by finding QED tests for which the bug occurs during the original sequence, but not during the duplicate one. This kind of QED test is much more effective with a simple extension to allow no-operation instructions (a trick also employed in [11]). To formalize this, we first define a set $N$ of no-operation instructions (NOPs).

Definition 17. Let $N$ be the set of instructions such that, for every state $(s_a, s_{\bar{a}})$, if $i_{nop} \in N$, then $T((s_a, s_{\bar{a}}), i_{nop}) = (s_a, s'_{\bar{a}})$ for some $s'_{\bar{a}} \in S_{\bar{a}}$.

An instruction in $N$ may change the non-architectural part of a state, but not the architectural part.

Definition 18. An extended QED test is any sequence of instructions obtained from a standard QED test by inserting zero or more instructions from $N$ anywhere in the sequence.

Extended QED tests enjoy the same properties as standard QED tests. In particular, an appropriately lifted version of Lemma 2 holds and the notions of failing and succeeding QED tests can be lifted to extended QED tests in the obvious way.
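The effect of inserting NOPs can be sketched on a toy processor in Python. The model below is a hypothetical illustration and not the paper's formalism: the register layout, the helper names `step`, `dup`, `insert_nops`, and `succeeds`, and the injected bug (an ADD executed directly after a MUL is off by one) are all assumptions. The NOP changes only the non-architectural part of the state (here, the record of the last opcode), in the spirit of Definition 17.

```python
from typing import List, Optional, Tuple

# Hypothetical toy model: (opcode, output reg, input reg 1, input reg 2).
Instr = Tuple[str, Optional[int], Optional[int], Optional[int]]
# State: (register values, last opcode); the last opcode is non-architectural.
State = Tuple[Tuple[int, ...], Optional[str]]

NOP: Instr = ("NOP", None, None, None)
N_ORIG = 4  # registers 0..3 are original locations, 4..7 their duplicates


def step(state: State, instr: Instr) -> State:
    """Execute one instruction on the buggy toy processor."""
    regs, last_op = state
    op, out, in1, in2 = instr
    if op == "NOP":
        return (regs, "NOP")  # registers untouched, non-architectural part changes
    value = regs[in1] + regs[in2] if op == "ADD" else regs[in1] * regs[in2]
    if op == "ADD" and last_op == "MUL":
        value += 1  # injected bug: ADD directly after MUL is off by one
    new_regs = list(regs)
    new_regs[out] = value
    return (tuple(new_regs), op)


def dup(instr: Instr) -> Instr:
    """Map an original instruction to its duplicate (register r -> r + N_ORIG)."""
    op, out, in1, in2 = instr
    return (op, out + N_ORIG, in1 + N_ORIG, in2 + N_ORIG)


def insert_nops(test: List[Instr], positions: List[int]) -> List[Instr]:
    """Insert one NOP before each listed index of the standard QED test."""
    extended: List[Instr] = []
    for idx, instr in enumerate(test):
        extended.extend([NOP] * positions.count(idx))
        extended.append(instr)
    return extended


def succeeds(test: List[Instr], init_regs: List[int]) -> bool:
    """True if the final state is QED-consistent."""
    state: State = (tuple(init_regs), None)
    for instr in test:
        state = step(state, instr)
    regs = state[0]
    return all(regs[l] == regs[l + N_ORIG] for l in range(N_ORIG))


originals = [("MUL", 2, 0, 1), ("ADD", 3, 2, 0)]
standard = originals + [dup(i) for i in originals]
init_regs = [3, 5, 0, 0, 3, 5, 0, 0]

# Standard test: the bug hits both the original and the duplicate ADD, so the
# two copies agree on the same wrong value and the bug goes unnoticed.
print(succeeds(standard, init_regs))
# Extended test: a NOP before the duplicate ADD lets it execute correctly.
print(succeeds(insert_nops(standard, [3]), init_regs))
```

In this sketch the standard test succeeds even though the processor is buggy, while the extended test fails because only the original ADD is affected by the bug.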
Definition 19 (Bug-Hunting Extended QED Test). Let $P$ be a single-instruction correct processor with at least one bug. The instruction sequence $\mathbf{i}$ is a bug-hunting extended QED test with a bug-prefix of size $k$ and initial state $s_0$ for $P$ if the following conditions hold:
1. There is some bug $B = \langle i_b, S_b \rangle$ in $P$ such that $T(s_0, \langle i_1, \ldots, i_{k-1} \rangle) \in S_b$ and $i_k = i_b$.
2. $\mathbf{i}$ is an extended QED test.
3. $i_k$ is an original instruction, and $i_{k+1} = Dup(i_1)$.

Unlike a bug-specific QED test, a bug-hunting extended QED test is not guaranteed to fail. It starts with a bug-triggering sequence of length $k$, and then finishes with a modified duplicate sequence which may add (or subtract) NOPs from $N$. The NOPs can be used to change the timing between any interdependent instructions, making it more likely that the duplicate sequence will produce a correct result, especially if the bug depends on forwarding logic. One can show (omitted for lack of space) that for a general class of forwarding-logic bugs, there does always exist an extended QED test that fails.

Another QED test extension is to allow original and duplicate instructions to be interleaved [10], rather than requiring that all original instructions precede all duplicate instructions [8]. Again, it is straightforward to show that this extension preserves Lemma 2. Clearly, the set of bugs that can be found by adding interleaving is a strict superset of those that can be found without. (Footnote: The bug in Example 3 can be detected by a QED test that interleaves original and duplicate instructions. A subsequence of two back-to-back MULs, an original followed by a duplicate, causes the duplicate MUL to produce an incorrect result at its output location. The final state is QED-inconsistent since the output location of the original MUL holds the correct value, while the output location of the duplicate holds an incorrect one.) In practice, implementations of SQED search for all possible extended QED tests with interleaving. Empirically, case studies have not turned up any (non-single-instruction) bugs that cannot be found with this combination. However, one can construct pathological systems with bugs that cannot be found by such QED tests. We address these cases next.

B. Hardware Extensions
With hardware support, stronger guarantees can be achieved that lead to our final completeness result (Theorem 2). We first introduce a soft-reset instruction, which transitions the non-architectural part of a state to the initial non-architectural state $s_{\bar{a},I}$ without changing the architectural part. Then we define a variant of bug-hunting extended QED tests where we insert soft-reset instructions in the sequence of duplicate instructions. This way, all duplicate instructions execute in an initial state and hence execute according to the specification for single-instruction correct processors. The resulting QED test always fails, in contrast to a bug-hunting extended QED test.

Definition 20. $i_r$ is a soft-reset instruction for $P$ if for every state $(s_a, s_{\bar{a}})$, $T((s_a, s_{\bar{a}}), i_r) = (s_a, s_{\bar{a},I})$.

It is easy to see that $i_r \in N$.

Definition 21 (Bug-Specific Soft-Reset QED Test). Let $P$ be single-instruction correct with at least one bug $B = \langle i_b, S_b \rangle$. The instruction sequence $\mathbf{i} = \langle i_1, \ldots, i_n \rangle$ is a bug-specific soft-reset QED test for $P$ if the following conditions hold:
1. $\mathbf{i}$ is a bug-hunting extended QED test for $P$ with a minimal bug-prefix of size $k \geq$ [...] and initial state $s_0$.
2. Let $\mathbf{s} = T(s_0, \mathbf{i})$. Then, $\forall l \in L_D.\ s_{k-1}(l) = s_k(l)$, i.e., $i_b = i_k$ does not corrupt any duplicate location.
3. $n = 3k$.
4. For each $1 \leq j \leq k$, $i_{k+2j-1} = i_r$.

Lemma 5. If $P$ is single-instruction correct and has a bug-specific soft-reset QED test $\mathbf{i}$, then $\mathbf{i}$ fails.

Proof. See appendix.

There are still a few (pathological) ways in which a bug may be missed by searching for all possible soft-reset QED tests. First, there may be no triggering sequence starting from any QED-consistent state. Second, it could be that the triggering sequence for a bug requires using more than half of all the locations, making it impossible to divide the locations among original and duplicate instructions. Finally, it could be that the bug always corrupts duplicate locations for every possible candidate sequence. These can all be remedied by adding hard reset instructions, which reset $P$ to a specific initial state.

Definition 22. The set $\{ i_{R,s_I} \mid s_I \in S_I \}$ is a family of hard reset instructions for $P$ if for every state $s$, $T(s, i_{R,s_I}) = s_I$.

Definition 23. Let $P$ be a processor. Then $\mathbf{i} = \langle i_1 \ldots i_{2k+2} \rangle$ is a bug-specific hard-reset QED test with bug-prefix size $k$ and initial state $s_I$ for $P$ if the following conditions hold:
1. $k \geq$ [...].
2. $\langle i_1 \ldots i_k \rangle$ reach and trigger a bug $B = \langle i_b, S_b \rangle$ in $P$ starting from $s_I$, where $i_k = i_b$.
3. $i_{k+1} = i_{R,s_I}$.
4. $\langle i_{k+2} \ldots i_{2k} \rangle = \langle i_1 \ldots i_{k-1} \rangle$.
5. $i_{2k+1} = i_r$.
6. $i_{2k+2} = i_k$.

Notice that there is no notion of duplication for a hard-reset QED test. Instead, the exact same sequence is executed twice except that there is a hard reset in between and a soft reset right before the last instruction. Hard-reset QED tests also use a slightly different notion of success and failure.

Definition 24. Let $\mathbf{i}$ be a bug-specific hard-reset QED test with bug-prefix size $k$ and initial state $s_I$, and let $\mathbf{s} = T(s_I, \mathbf{i})$.
• $\mathbf{i}$ succeeds if $s_k(l) = s_{2k+2}(l)$ for every location $l \in L$.
• $\mathbf{i}$ fails if $s_k(l) \neq s_{2k+2}(l)$ for some location $l \in L$.

The combination of single-instruction correctness checking and exhaustive search for hard-reset QED tests is complete.
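The shape of a hard-reset QED test can be sketched in Python on a toy processor: the bug-triggering prefix, a hard reset, a replay of all but the last prefix instruction, a soft reset, and then the last instruction again. The model is a hypothetical illustration only. The register layout, the opcode names `HARD_RESET` and `SOFT_RESET`, the helper names, and the injected bug (an ADD executed directly after a MUL is off by one) are assumptions and do not come from the paper.

```python
from typing import List, Optional, Tuple

# Hypothetical toy model. State: (register values, last opcode); the last
# opcode stands in for the non-architectural part of the state.
State = Tuple[Tuple[int, ...], Optional[str]]
INIT_NONARCH = None  # assumed initial non-architectural state


def step(state: State, instr: tuple) -> State:
    """Execute one instruction, including the two reset instructions."""
    regs, last_op = state
    op = instr[0]
    if op == "HARD_RESET":
        return instr[1]  # jump back to the given initial state
    if op == "SOFT_RESET":
        return (regs, INIT_NONARCH)  # keep registers, reset the rest
    _, out, in1, in2 = instr
    value = regs[in1] + regs[in2] if op == "ADD" else regs[in1] * regs[in2]
    if op == "ADD" and last_op == "MUL":
        value += 1  # injected bug: ADD directly after MUL is off by one
    new_regs = list(regs)
    new_regs[out] = value
    return (tuple(new_regs), op)


def hard_reset_test(prefix: List[tuple], init_state: State) -> List[tuple]:
    """Build: prefix, hard reset, prefix without last, soft reset, last."""
    return (list(prefix)
            + [("HARD_RESET", init_state)]
            + list(prefix[:-1])
            + [("SOFT_RESET",)]
            + [prefix[-1]])


def hard_reset_test_fails(prefix: List[tuple], init_state: State) -> bool:
    """Compare registers after the prefix with registers at the very end."""
    k = len(prefix)
    trace: List[State] = []
    state = init_state
    for instr in hard_reset_test(prefix, init_state):
        state = step(state, instr)
        trace.append(state)
    # trace[k - 1] is the state after the k-th instruction; trace[-1] is the
    # state after all 2k + 2 instructions.
    return trace[k - 1][0] != trace[-1][0]


init_state: State = ((3, 5, 0, 0), INIT_NONARCH)
buggy_prefix = [("MUL", 2, 0, 1), ("ADD", 3, 2, 0)]   # triggers the bug
benign_prefix = [("ADD", 2, 0, 1), ("MUL", 3, 0, 1)]  # does not trigger it
print(hard_reset_test_fails(buggy_prefix, init_state))
print(hard_reset_test_fails(benign_prefix, init_state))
```

Because the soft reset clears the record of the preceding MUL, the replayed ADD computes the correct value, so the two register snapshots differ exactly when the prefix triggered the bug.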
Theorem 2. If $P$ is single-instruction correct and has no failing bug-specific hard-reset QED tests, then it is correct.

Proof. See appendix.

VII. RELATED WORK
Assertion-based formal verification techniques using theorem proving or (bounded) model checking, e.g., [1], [16]–[18], require implementation-specific, manually-written properties. In contrast to that, symbolic quick error detection (SQED) [7]–[10] is based on a universal self-consistency property.

In an early application of self-consistency checking for processor verification without a specification [11], given instruction sequences are transformed by, e.g., inserting NOPs. The original and the modified instruction sequence are expected to produce the same result. As a formal foundation, this approach relies on formulating and explicitly computing an equivalence relation over states, which is not needed with SQED.

SQED originates from quick error detection (QED), a post-silicon validation technique [19]–[21]. QED is highly effective in reducing the length of existing bug traces (i.e., instruction sequences) in post-silicon debugging of processor cores. To this end, existing bug traces are systematically transformed into QED tests by techniques that (among others) include instruction duplication [22]. SQED exhaustively searches for minimal-length QED tests using BMC for pre-silicon verification. It is also applicable to post-silicon validation. SQED was extended to operate with symbolic initial states [12], [23] to overcome the potential limitations of BMC when unrolling the transition relation of a design starting in a concrete initial state.

SQED employs the principle of self-consistency based on a mathematical interpretation of instructions as functions. That principle is also applied by accelerator quick error detection (A-QED) [24], a formal pre-silicon verification technique for HW accelerator designs. A-QED checks the functions implemented by an accelerator for functional consistency and, like SQED, does not require a formal specification.

Unique program execution checking [25] relies on a particular variant of self-consistency to check security vulnerabilities of processor designs for covert channel attacks. In the context of security, self-consistency is also applied to verify secure information flow by self-composition of programs [26]–[29].

Several approaches, including both formal and simulation-based approaches, exist for checking single-instruction (SI) correctness, cf. [9], [23], [30]. Checking SI correctness is complementary to checking self-consistency using SQED and is also much more tractable. In a formal approach, a property corresponding to $Spec_{op}$ (based on the ISA) is written for each opcode $op \in Op$, and the model checker is used to ensure that the property holds when starting from any initial state. Because the approach is restricted to initial states and only a single instruction execution, it is much simpler to specify and check than would be a property specifying the full correctness of $P$. Efficient specialized approaches exist for checking multiplier units [31]–[34], which is computationally hard.

VIII. CONCLUSION AND FUTURE WORK
We laid a formal foundation for symbolic quick error detection (SQED) and presented a theoretical framework to reason about its bug-finding capabilities. In our framework, we proved soundness as well as (conditional) completeness, thereby closing a gap in the theoretical understanding of SQED. Soundness implies that SQED does not produce spurious counterexamples, i.e., any counterexample to QED-consistency reported by SQED corresponds to an actual bug in the design. For completeness, we characterized a large class of bugs that can be detected by failing QED tests under modest assumptions about these bugs. We also identified several QED test extensions based on executing no-operation and reset instructions. For these extensions, we proved even stronger completeness guarantees, ultimately leading to a variant of SQED that, together with single-instruction correctness, is complete.

As future work, it would be valuable to extend our framework to consider variants of SQED that operate with more fully symbolic initial states [12], [23]. The challenge will be to identify how this can be done while guaranteeing no spurious counterexamples. For practical applications, our theoretical results provide valuable insights. For example, in present implementations of SQED [9], [10], the flexibility to partition register/memory locations into sets of original and duplicate locations and to select the bijective mapping between these two sets has not yet been explored. Similarly, it is promising to combine standard QED tests and the specialized extensions we presented in a uniform practical tool framework. Features like soft/hard reset instructions could either be implemented in HW in a design-for-verification approach or in software inside a model checker. In another research direction, we plan to extend our framework to model the detection of deadlocks using SQED, cf. [7], and prove related theoretical guarantees.
Acknowledgments. We thank Karthik Ganesan and John Tigar Humphries for helpful initial discussions and the anonymous reviewers for their feedback.
REFERENCES
[1] A. Biere, A. Cimatti, E. M. Clarke, and Y. Zhu, “Symbolic Model Checking without BDDs,” in Proc. TACAS, ser. LNCS, vol. 1579. Springer, 1999, pp. 193–207.
[2] S. Katz, O. Grumberg, and D. Geist, “"Have I written enough Properties?" - A Method of Comparison between Specification and Implementation,” in Proc. CHARME, ser. LNCS, vol. 1703. Springer, 1999, pp. 280–297.
[3] H. Chockler, O. Kupferman, and M. Y. Vardi, “Coverage Metrics for Temporal Logic Model Checking,” in Proc. TACAS, ser. LNCS, vol. 2031. Springer, 2001, pp. 528–542.
[4] K. Claessen, “A Coverage Analysis for Safety Property Lists,” in Proc. FMCAD. IEEE, 2007, pp. 139–145.
[5] D. Große, U. Kühne, and R. Drechsler, “Estimating functional coverage in bounded model checking,” in Proc. DATE. EDA Consortium, San Jose, CA, USA, 2007, pp. 1176–1181.
[6] H. Chockler, D. Kroening, and M. Purandare, “Coverage in interpolation-based model checking,” in Proc. DAC. ACM, 2010, pp. 182–187.
[7] D. Lin, E. Singh, C. Barrett, and S. Mitra, “A structured approach to post-silicon validation and debug using symbolic quick error detection,” in Proc. ITC. IEEE, 2015, pp. 1–10.
[8] E. Singh, D. Lin, C. Barrett, and S. Mitra, “Logic bug detection and localization using symbolic quick error detection,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, pp. 1–1, 2018.
[9] E. Singh, K. Devarajegowda, S. Simon, R. Schnieder, K. Ganesan, M. R. Fadiheh, D. Stoffel, W. Kunz, C. W. Barrett, W. Ecker, and S. Mitra, “Symbolic QED Pre-Silicon Verification for Automotive Microcontroller Cores: Industrial Case Study,” in Proc. DATE. IEEE, 2019, pp. 1000–1005.
[10] F. Lonsing, K. Ganesan, M. Mann, S. S. Nuthakki, E. Singh, M. Srouji, Y. Yang, S. Mitra, and C. W. Barrett, “Unlocking the Power of Formal Hardware Verification with CoSA and Symbolic QED: Invited Paper,” in Proc. ICCAD. ACM, 2019, pp. 1–8.
[11] R. B. Jones, C. H. Seger, and D. L. Dill, “Self-Consistency Checking,” in Proc. FMCAD, ser. LNCS, vol. 1166. Springer, 1996, pp. 159–171.
[12] M. R. Fadiheh, J. Urdahl, S. S. Nuthakki, S. Mitra, C. Barrett, D. Stoffel, and W. Kunz, “Symbolic quick error detection using symbolic initial state for pre-silicon verification,” in Proc. DATE. IEEE, 2018, pp. 55–60.
[13] R. M. Keller, “A Fundamental Theorem of Asynchronous Parallel Computation,” in Parallel Processing, Proc. Sagamore Computer Conference, ser. LNCS, vol. 24. Springer, 1974, pp. 102–112.
[14] R. M. Keller, “Formal Verification of Parallel Programs,” Commun. ACM, vol. 19, no. 7, pp. 371–384, 1976.
[15] B. Huang, H. Zhang, P. Subramanyan, Y. Vizel, A. Gupta, and S. Malik, “Instruction-Level Abstraction (ILA): A Uniform Specification for System-on-Chip (SoC) Verification,” ACM Trans. Design Autom. Electr. Syst., vol. 24, no. 1, pp. 10:1–10:24, 2019.
[16] W. A. Hunt Jr., “Microprocessor design verification,” J. Autom. Reasoning, vol. 5, no. 4, pp. 429–460, 1989.
[17] J. R. Burch and D. L. Dill, “Automatic Verification of Pipelined Microprocessor Control,” in Proc. CAV, ser. LNCS, vol. 818. Springer, 1994, pp. 68–80.
[18] A. Biere, E. M. Clarke, R. Raimi, and Y. Zhu, “Verifiying Safety Properties of a Power PC Microprocessor Using Symbolic Model Checking without BDDs,” in Proc. CAV, ser. LNCS, vol. 1633. Springer, 1999, pp. 60–71.
[19] T. Hong, Y. Li, S. Park, D. Mui, D. Lin, Z. A. Kaleq, N. Hakim, H. Naeimi, D. S. Gardner, and S. Mitra, “QED: Quick Error Detection tests for effective post-silicon validation,” in Proc. ITC. IEEE, 2010, pp. 154–163.
[20] D. Lin, T. Hong, Y. Li, F. Fallah, D. S. Gardner, N. Hakim, and S. Mitra, “Overcoming post-silicon validation challenges through quick error detection (QED),” in Proc. DATE. EDA Consortium San Jose, CA, USA / ACM DL, 2013, pp. 320–325.
[21] D. Lin, T. Hong, Y. Li, E. S, S. Kumar, F. Fallah, N. Hakim, D. S. Gardner, and S. Mitra, “Effective Post-Silicon Validation of System-on-Chips Using Quick Error Detection,” IEEE Trans. on CAD of Integrated Circuits and Systems, vol. 33, no. 10, pp. 1573–1590, 2014.
[22] N. Oh, P. P. Shirvani, and E. J. McCluskey, “Error detection by duplicated instructions in super-scalar processors,” IEEE Trans. Reliability, vol. 51, no. 1, pp. 63–75, 2002.
[23] K. Devarajegowda, M. R. Fadiheh, E. Singh, C. Barrett, S. Mitra, W. Ecker, D. Stoffel, and W. Kunz, “Gap-free Processor Verification by S²QED and Property Generation,” in Proc. DATE. IEEE, 2020.
[24] E. Singh, F. Lonsing, S. Chattopadhyay, M. Strange, P. Wei, X. Zhang, Y. Zhou, D. Chen, J. Cong, P. Raina, Z. Zhang, C. Barrett, and S. Mitra, “A-QED Verification of Hardware Accelerators,” in Proc. DAC, to appear. ACM, 2020.
[25] M. R. Fadiheh, D. Stoffel, C. W. Barrett, S. Mitra, and W. Kunz, “Processor Hardware Security Vulnerabilities and their Detection by Unique Program Execution Checking,” in Proc. DATE. IEEE, 2019, pp. 994–999.
[26] G. Barthe, P. R. D’Argenio, and T. Rezk, “Secure Information Flow by Self-Composition,” in Proc. CSFW-17. IEEE, 2004, pp. 100–114.
[27] G. Barthe, J. M. Crespo, and C. Kunz, “Relational Verification Using Product Programs,” in Proc. FM, ser. LNCS, vol. 6664. Springer, 2011, pp. 200–214.
[28] J. B. Almeida, M. Barbosa, G. Barthe, F. Dupressoir, and M. Emmi, “Verifying Constant-Time Implementations,” in Proc. USENIX. USENIX Association, 2016, pp. 53–70.
[29] W. Yang, Y. Vizel, P. Subramanyan, A. Gupta, and S. Malik, “Lazy Self-composition for Security Verification,” in Proc. CAV, ser. LNCS, vol. 10982. Springer, 2018, pp. 136–156.
[30] A. Reid, R. Chen, A. Deligiannis, D. Gilday, D. Hoyes, W. Keen, A. Pathirane, O. Shepherd, P. Vrabel, and A. Zaidi, “End-to-End Verification of Processors with ISA-Formal,” in Proc. CAV, ser. LNCS, vol. 9780. Springer, 2016, pp. 42–58.
[31] U. Krautz, M. Wedler, W. Kunz, K. Weber, C. Jacobi, and M. Pflanz, “Verifying full-custom multipliers by Boolean equivalence checking and an arithmetic bit level proof,” in ASP-DAC. IEEE, 2008, pp. 398–403.
[32] A. A. R. Sayed-Ahmed, D. Große, U. Kühne, M. Soeken, and R. Drechsler, “Formal verification of integer multipliers by combining Gröbner basis with logic reduction,” in Proc. DATE, 2016, pp. 1048–1053.
[33] D. Ritirc, A. Biere, and M. Kauers, “Column-wise verification of multipliers using computer algebra,” in Proc. FMCAD, 2017, pp. 23–30.
[34] D. Kaufmann, A. Biere, and M. Kauers, “Verifying Large Multipliers by Combining SAT and Computer Algebra,” in Proc. FMCAD. IEEE, 2019, pp. 28–36.

APPENDIX A
PROOFS
Proof of Lemma 1. Assume that the antecedent of the implication holds, and let $l_O \in L_O$ be an arbitrary original memory location. If $l_O = L_{out}(i_O)$, then $s'(L_{out}(i_O)) = s''(L_D(L_{out}(i_O)))$ by Corollary 1.

Suppose, on the other hand, that $l_O \neq L_{out}(i_O)$. Let $l_D = L_D(l_O)$ be the corresponding duplicate location. By the injectivity of $L_D$, we have $l_D \neq L_D(L_{out}(i_O))$, and thus $l_D \neq L_{out}(i_D)$. We can thus conclude from (1) that $s(l_O) = s'(l_O)$ and $s(l_D) = s''(l_D)$.

Finally, since $s(l_O) = s(l_D)$ by assumption, we derive $s'(l_O) = s''(l_D)$, that is, $s'(l_O) = s''(L_D(l_O))$.

Proof of Case B of Lemma 4. Since $QEDcons(s_0)$, we have

  $s_0(l_{orig}) = s_0(l_{dup})$    (15)

Since $Spec(s_0, i_1, s_1)$ by Definition 16, we have $s_0(l_{orig}) = s_1(l_{orig})$ and $s_0(l_{dup}) = s_1(l_{dup})$, and so it follows that

  $s_1(l_{orig}) = s_1(l_{dup})$    (16)

Due to the requirements in Case B of Definition 16, we have

  $s_{n+1}(l_{dup}) = s_{2n}(l_{dup})$    (17)
  $s_1(l_{orig}) = s_{2n}(l_{orig})$    (18)
  $s_1(l_{dup}) = s_n(l_{dup})$    (19)

Now, because we are in Case B, we know that $s_n(l_{dup}) \neq s_{n+1}(l_{dup})$, so by (17),

  $s_n(l_{dup}) \neq s_{2n}(l_{dup})$    (20)

But (16), (18) and (19) give us:

  $s_n(l_{dup}) = s_{2n}(l_{orig})$    (21)

Thus, by (20) and (21),

  $s_{2n}(l_{dup}) \neq s_{2n}(l_{orig})$    (22)

and hence $\neg QEDcons(s_{2n})$.

Proof of Lemma 5. Because $k$ is minimal, we know that $i_1 \ldots i_{k-1}$ all execute according to their specification. For $i_{k+j}$, if $j$ is odd, then it is a no-operation and therefore it changes no location values, and if $j$ is even, then it executes according to specification because it is executing in an initial state. Let $l$ be some original location whose value is incorrect after the buggy instruction $i_k = i_b$ executes (we know the buggy instruction corrupts an original location because it does not corrupt a duplicate location), with $l_D = L_D(l)$. We consider a type-B bug as in Definition 16 first and assume that $l$ is not the output location of $i_k$. Since $s_0$ is QED-consistent, $s_0(l) = s_0(l_D)$. Since instructions 1 through $k$ do not change duplicate locations, we have $s_0(l) = s_k(l_D)$. By repeated application of Lemma 1 and by definition of no-operation instructions, we can then conclude that $s_{k-1}(l) = s_{3k-2}(l_D)$. Then, because of the bug, it follows that $s_k(l) \neq s_{3k}(l_D)$. Finally, because none of the instructions after $i_k$ modify original locations, $s_k(l) = s_{3k}(l)$, so $\neg QEDcons(s_{3k})$.

The case of a type-A bug where $l$ is the output location of $i_k$ can be proved analogously.

Proof of Theorem 2.