A Theoretical Framework for Symbolic Quick Error Detection

Abstract

Symbolic quick error detection (SQED) is a formal pre-silicon verification technique targeted at processor designs. It leverages bounded model checking (BMC) to check a design for counterexamples to a self-consistency property: given the instruction set architecture (ISA) of the design, executing an instruction sequence twice on the same inputs must always produce the same outputs. Self-consistency is a universal, implementation-independent property. Consequently, in contrast to traditional verification approaches that use implementation-specific assertions (often generated manually), SQED does not require a full formal design specification or manually-written properties. Case studies have shown that SQED is effective for commercial designs and that SQED substantially improves design productivity. However, until now there has been no formal characterization of its bug-finding capabilities. We aim to close this gap by laying a formal foundation for SQED. We use a transition-system processor model and define the notion of a bug using an abstract specification relation. We prove the soundness of SQED, i.e., that any bug reported by SQED is in fact a real bug in the processor. Importantly, this result holds regardless of what the actual specification relation is. We next describe conditions under which SQED is complete, that is, what kinds of bugs it is guaranteed to find. We show that for a large class of bugs, SQED can always find a trace exhibiting the bug. Ultimately, we prove full completeness of a variant of SQED that uses specialized state reset instructions. Our results enable a rigorous understanding of SQED and its bug-finding capabilities and give insights on how to optimize implementations of SQED in practice.



Florian Lonsing, Subhasish Mitra, and Clark Barrett

Computer Science Department, Stanford University, Stanford, CA 94305, USA
E-mail: {lonsing, subh, barrett}@stanford.edu


I. INTRODUCTION

Pre-silicon verification of HW designs given as models in a HW description language (e.g., Verilog) is a critical step in HW design. Due to the steadily increasing complexity of designs, it is crucial to detect logic design bugs before fabrication to avoid more difficult and costly debugging in post-silicon validation.

Formal techniques such as bounded model checking (BMC) [1] have an advantage over traditional pre-silicon verification techniques such as simulation in that they are exhaustive up to the BMC bound. Hence, formal techniques provide valuable guarantees about the correctness of a design under verification (DUV) with respect to the checked properties. However, in traditional assertion-based formal verification techniques, these properties are implementation-specific and must be written manually based on expert knowledge about the DUV. Moreover, it is a well-known, long-standing challenge that sets of manually-written, implementation-specific properties might be insufficient to detect all bugs present in a DUV [2]–[6].

This work was supported by the Defense Advanced Research Projects Agency, grant FA8650-18-2-7854.

Article to appear in Proc. FMCAD 2020.

Symbolic quick error detection (SQED) [7]–[10] is a formal pre-silicon verification technique targeted at processor designs. In sharp contrast to traditional formal approaches, SQED does not require manually-written properties or a formal specification of the DUV. Instead, it checks whether a self-consistency [11] property holds in the DUV. The self-consistency property employed by SQED is universal and implementation-independent. Each instruction in the instruction set architecture (ISA) of the DUV is interpreted as a function in a mathematical sense. The self-consistency check then amounts to checking whether the outputs produced by executing a particular instruction sequence match if the sequence is executed twice, assuming the inputs to the two sequences also match.

SQED leverages BMC to exhaustively explore all possible instruction sequences up to a certain length starting from a set of initial states. Several case studies have demonstrated that SQED is highly effective at producing short bug traces by finding counterexamples to self-consistency in a variety of processor designs, including industrial designs [9]. Moreover, SQED substantially increases verification productivity.

However, until now there has been no rigorous theoretical understanding of (A) whether counterexamples to self-consistency found by SQED always correspond to actual bugs in the DUV (the soundness of SQED) and (B) whether for each bug in the DUV there exists a counterexample to self-consistency that SQED can find (the completeness of SQED). This paper makes significant progress towards closing this gap.

We model a processor as a transition system. This model abstracts away implementation-level details, yet is sufficiently precise to formalize the workings of SQED. To prove soundness and (conditional) completeness of SQED, we need to establish a correspondence between counterexamples to self-consistency and bugs in a DUV.
In our formal model we achieve this correspondence by first defining the correctness of instruction executions by means of a general, abstract specification. A bug is then a violation of this specification. The abstract specification expresses the following general and natural property we expect to hold for actual DUVs: an instruction writes a correct output value into a destination location and does not modify any other locations.

As our main results, we prove soundness and conditional completeness of SQED. For soundness, we prove that if SQED reports a counterexample to the universal self-consistency property, then the processor has a bug. This result shows that SQED does not produce spurious counterexamples. Importantly, this result holds regardless of the actual specification, confirming that SQED does not depend on such implementation-specific details. For completeness, we prove that if the processor has a bug then, under modest assumptions, there exists a counterexample to self-consistency that can be found by SQED. We also show that SQED can be made fully (unconditionally) complete with additional HW support in the form of specialized state reset instructions. Our results enable a rigorous understanding of SQED and its bug-finding capabilities in actual DUVs and provide insight on how to optimize implementations of SQED.

In the following, we first present an overview of SQED from a theoretical perspective (Section II). Then we define our transition system model of processors (Section III) and formalize the correctness of instruction executions in terms of an abstract specification relation (Section IV). After establishing a correspondence between the abstract specification and the self-consistency property employed by SQED (Section V), we prove soundness and (conditional) completeness of SQED (Section VI). We conclude with a discussion of related work and future research directions (Sections VII and VIII).

II. OVERVIEW OF SQED

We first informally introduce the basic concepts and terminology related to SQED. Fig. 1a shows an overview of the high-level workflow. Given a processor design $P$, i.e., the DUV, SQED is based on symbolic execution of instruction sequences using BMC. We assume that an instruction $i = (\mathit{op}, l, (l', l''))$ consists of an opcode $\mathit{op}$, an output location $l$, and a pair $(l', l'')$ of input locations. Locations are an abstraction used to represent registers and memory locations.

The self-consistency check is based on executing two instructions that should always produce the same result. The two instructions are called an original and a duplicate instruction, respectively. The duplicate instruction has the same opcode as the original one, i.e., it implements the same functionality, but it operates on different input and output locations. The locations on which the duplicate instruction operates are determined by an arbitrary but fixed bijective function $L_D : \mathcal{L}_O \to \mathcal{L}_D$ between two subsets $\mathcal{L}_O$, the original locations, and $\mathcal{L}_D$, the duplicate locations, that form a partition of the set $\mathcal{L}$ of all locations in $P$. An original instruction can only use locations in $\mathcal{L}_O$. An instruction duplication function $\mathit{Dup}$ then maps any original instruction $i_O$ to its duplicate $i_D$ by copying the opcode and then applying $L_D$ to its locations.

Example 1.

Let L = { , . . . , } be the identifiers of 32 registers of a processor P, and consider the partition L O = { , , . . . , } and L D = { , , . . . , }. Let i O = ( ADD , l , ( l , l )) be an original register-type ADD instruction operating on registers , , and . Using L D ( k ) = k + 16, we obtain Dup ( i O ) = i D = ( ADD , l , ( l , l )).

Consider a different partition L′ O = { , , , . . . , } and L′ D = { , , , . . . , } and function L′ D ( k ) = k + 1. For this function, Dup ( i O ) = ( ADD , l , ( l , l )).

This model is used for simplicity, but it could easily be extended to allow instructions with additional inputs or outputs.
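The duplication function of Example 1 can be sketched in a few lines of Python. This is an illustrative sketch only: the 0-based register numbering, the concrete register indices of the sample instruction, and the names `Instr`, `offset_map`, `interleaved_map`, and `dup` are assumptions made for the illustration and are not taken from the paper.

```python
from typing import Callable, NamedTuple, Tuple


class Instr(NamedTuple):
    """An instruction (op, l, (l', l'')): opcode, output, two inputs."""
    op: str
    out: int
    ins: Tuple[int, int]


def offset_map(k: int) -> int:
    """L_D(k) = k + 16, assuming original locations 0..15 (hypothetical numbering)."""
    if not 0 <= k <= 15:
        raise ValueError(f"{k} is not an original location")
    return k + 16


def interleaved_map(k: int) -> int:
    """L'_D(k) = k + 1, assuming even original locations 0, 2, ..., 30."""
    if k % 2 != 0 or not 0 <= k <= 30:
        raise ValueError(f"{k} is not an original location")
    return k + 1


def dup(instr: Instr, loc_map: Callable[[int], int]) -> Instr:
    """Copy the opcode and apply the location mapping to every location."""
    return Instr(instr.op, loc_map(instr.out),
                 (loc_map(instr.ins[0]), loc_map(instr.ins[1])))


# Hypothetical original instruction; the register indices are illustrative.
i_o = Instr("ADD", 12, (4, 8))
print(dup(i_o, offset_map))
print(dup(i_o, interleaved_map))
```

Passing the mapping as a parameter mirrors the fact that the partition and the bijection are arbitrary but fixed: the same `dup` works for either choice.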

Self-consistency checking is implemented using QED tests. A QED test is an instruction sequence $\mathbf{i} = \mathbf{i}_O :: \mathbf{i}_D$ consisting of a sequence $\mathbf{i}_O$ of $n$ original instructions followed by a corresponding sequence $\mathbf{i}_D = \mathit{Dup}(\mathbf{i}_O)$ of $n$ duplicate instructions (where operator "::" denotes concatenation). A QED test $\mathbf{i}$ is symbolically executed from a QED-consistent state, that is, a state where the value stored in each original location $l$ is the same as the value stored in its corresponding duplicate location $L_D(l)$. The resulting final state after executing $\mathbf{i}$ should then also be QED-consistent. Fig. 1a illustrates the workflow. A QED test $\mathbf{i}$ succeeds if the final state that results from executing $\mathbf{i}$ is QED-consistent; otherwise it fails. Starting the execution in a QED-consistent state guarantees that original and duplicate instructions receive the same input values. Thus, if the final state is not QED-consistent, then this indicates that some pair of original and duplicate instructions behaved differently.

Example 2.

Consider Fig. 1b and the QED test $\mathbf{i} = i_O :: i_D$ consisting of one original instruction $i_O$ and its duplicate $\mathit{Dup}(i_O) = i_D$ for some function $L_D$. Suppose that $\mathbf{i}$ is executed in a QED-consistent state $s_0$ (denoted by $\mathit{QEDcons}(s_0)$ and $s_0(\mathcal{L}_O) = s_0(\mathcal{L}_D)$) and both $i_O$ and $i_D$ execute correctly. Instruction $i_O$ produces state $s_1$, where the values at duplicate locations remain unchanged, i.e., $s_1(\mathcal{L}_D) = s_0(\mathcal{L}_D)$, because $i_O$ operates on original locations only. When instruction $i_D$ is executed in state $s_1$, it modifies only duplicate locations. The final state $s_2$ is QED-consistent (denoted by $\mathit{QEDcons}(s_2)$ and $s_2(\mathcal{L}_O) = s_2(\mathcal{L}_D)$), and thus QED test $\mathbf{i}$ succeeds.

Example 3 (Bug Detection). Consider processor P and L O and L D from Example 1. Let i O, = ( ADD , l , ( l , l )) and i O, = ( MUL , l , ( l , l )) be original register-type addition and multiplication instructions. Using L D ( k ) = k + 16, we obtain Dup ( i O, ) = i D, = ( ADD , l , ( l , l )) and Dup ( i O, ) = i D, = ( MUL , l , ( l , l )). Assume that P has a bug that is triggered when two MUL instructions are executed in subsequent clock cycles, resulting in the corruption of the output location of the second MUL instruction. Note that executing the QED test i = i O, , i O, :: i D, , i D, in a QED-consistent initial state produces a QED-consistent final state: the bug is not triggered by i because i D, is executed between i O, and i D, . A slightly longer test i = i O, , i O, , i O, :: i D, , i D, , i D, does trigger the bug, however, because the subsequence i O, , i D, of two back-to-back MULs causes the first duplicate instruction i D, in i to produce an incorrect result at l . This incorrect result then propagates through the next two instructions, resulting in a QED-inconsistent final state since the values at l and l , i.e., the output locations of i O, and i D, , differ.

QED-consistency is the universal, implementation-independent property that is checked in SQED.
In practice, the property must refer to some basic information about the design, such as symbolic register names, but this can be generated automatically from a high-level ISA description [10]. BMC is used to symbolically and exhaustively generate all possible QED tests up to a certain length $n$ (the BMC bound). BMC ensures that SQED will find the shortest possible failing QED test first. The high-level workflow shown in Fig. 1a allows for flexibility in choosing the partition and mapping between original and duplicate locations. We rely on this flexibility for the results in this paper (Theorems 1 and 2). Current SQED implementations use a predefined partition and mapping, based on which BMC enumerates all possible QED tests. Extending implementations to have the BMC tool also choose a partition and mapping could be explored in future work.

We refer to related work [7], [9], [12] for case studies that demonstrate the effectiveness of BMC-based SQED on a variety of processor designs. The scalability of SQED in practice is determined by the scalability of the BMC tool being used. Thus, approaches for improving scalability of BMC can also be applied to SQED, e.g., abstraction, decomposition, and partial instantiation techniques [7].

Fig. 1. SQED workflow from a theoretical perspective (a) and illustration of executing the QED test $\mathbf{i} = i_O :: i_D$ in Example 2 (b).

This scenario corresponds to a real bug in an out-of-order RISC-V design detected by SQED: https://github.com/ridecore/ridecore/issues/4.

III. INSTRUCTION AND PROCESSOR MODEL

We model a processor as a transition system containing an abstract set of locations. The set of locations includes registers and memory locations. A state of a processor consists of an architectural and a non-architectural part. In a state transition that results from executing an instruction, the architectural part of a state is modified explicitly by updating the value at the output location of the executed instruction. The architectural part of a state is also called the software-visible state of the processor. It comprises those parts of the state that can be updated by executing instructions of the user-level ISA of the processor, such as memory locations and general-purpose registers. The non-architectural part of a state comprises the remaining parts that are updated only implicitly by executing an instruction, such as pipeline or status registers.

Instructions are functions that take inputs from locations and write an output to a location. We assume that every instruction produces its result in one transition. In our model, we abstract away implementation details of complex processor designs (e.g., pipelined, out-of-order, multi-processor systems). This is for ease of presentation and reasoning. However, many of these complexities can be viewed as refinements of our abstraction, meaning that our formal results still hold on complex models (i.e., our results can be lowered to more detailed models such as those described in [7], [8]). Working out the details of such refinements is one important avenue for future work.
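As a rough illustration of this modeling style, the following Python sketch represents a state as an architectural part (one value per location) plus a non-architectural part, and executes each instruction in a single transition. Everything concrete here is an assumption made for the sketch: the 8-bit value domain, the two opcodes, the choice of "last executed opcode" as the non-architectural part, and the names `State`, `Instr`, `transition`, and `run`.

```python
from typing import NamedTuple, Optional, Sequence, Tuple

WORD = 256  # assumed 8-bit data values


class Instr(NamedTuple):
    op: str
    out: int              # output location
    ins: Tuple[int, int]  # pair of input locations


class State(NamedTuple):
    arch: Tuple[int, ...]    # architectural part: value at each location
    non_arch: Optional[str]  # non-architectural part (here: last opcode)


OPCODES = {
    "ADD": lambda a, b: (a + b) % WORD,
    "MUL": lambda a, b: (a * b) % WORD,
}

INITIAL_NON_ARCH = None  # the unique initial non-architectural state


def transition(s: State, i: Instr) -> State:
    """Total transition function: one instruction, one step."""
    a, b = s.arch[i.ins[0]], s.arch[i.ins[1]]
    arch = list(s.arch)
    arch[i.out] = OPCODES[i.op](a, b)  # explicit architectural update
    return State(tuple(arch), i.op)    # implicit non-architectural update


def run(s: State, seq: Sequence[Instr]) -> State:
    """Lift the transition function to instruction sequences."""
    for i in seq:
        s = transition(s, i)
    return s


s0 = State((3, 5, 0, 0), INITIAL_NON_ARCH)
s2 = run(s0, [Instr("ADD", 2, (0, 1)), Instr("MUL", 3, (2, 0))])
print(s2)
```

Keeping the non-architectural part as a separate field makes it easy to model bugs that depend on hidden state, such as which instruction executed in the previous cycle.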

Definition 1 (Transition System). A processor is a transition system [13], [14] $P = (\mathcal{V}, \mathcal{L}, S_{\bar{a}}, s_{\bar{a},I}, \mathit{Op}, I, T)$, where
• $\mathcal{V}$ is a set of abstract data values,
• $\mathcal{L}$ is a set of memory locations (from which we define the set $S_a$ of architectural states as the set of total functions from locations to values, i.e., $S_a = \{ s_a \mid s_a : \mathcal{L} \to \mathcal{V} \}$),
• $S_{\bar{a}}$ is a set of non-architectural states (from which we further define the set of all states as $S = S_a \times S_{\bar{a}}$),
• $s_{\bar{a},I} \in S_{\bar{a}}$ is a unique initial non-architectural state (from which we define the set of initial states as $S_I = S_a \times \{ s_{\bar{a},I} \}$),
• $\mathit{Op}$ is a set of operation codes (opcodes),
• $I = \mathit{Op} \times \mathcal{L} \times \mathcal{L}^2$ is the set of instructions, and
• $T : S \times I \to S$ is the transition function, which is total.

A state $s \in S$ with $s = (s_a, s_{\bar{a}})$ consists of an architectural part $s_a \in S_a$ and a non-architectural part $s_{\bar{a}} \in S_{\bar{a}}$. In the architectural part $s_a : \mathcal{L} \to \mathcal{V}$, $\mathcal{L}$ represents all possible registers and memory locations, i.e., in practical terms, $\mathcal{L}$ is the address space of $P$. An initial state $s_I \in S_I$ with $s_I = (s_a, s_{\bar{a},I})$ is defined by a unique non-architectural part $s_{\bar{a},I} \in S_{\bar{a}}$ and an arbitrary architectural part $s_a \in S_a$. We assume that $s_{\bar{a},I} \in S_{\bar{a}}$ is unique to make the exposition simpler. Our model could easily be extended to a set of initial non-architectural states. The number $|\mathcal{L}|$ of memory locations is arbitrary but fixed. We write $v = s(l)$ to denote the value $v = s_a(l)$ at location $l \in \mathcal{L}$ in state $s = (s_a, s_{\bar{a}})$. We also write $(v, v') = s(l, l')$ as shorthand for $v = s(l)$ and $v' = s(l')$.

To formally define instruction duplication, we need to reason about original and duplicate memory locations. To this end, we partition the set $\mathcal{L}$ of memory locations into two sets of equal size, the original and duplicate locations $\mathcal{L}_O$ and $\mathcal{L}_D$, respectively, i.e., $\mathcal{L}_O \cap \mathcal{L}_D = \emptyset$, $\mathcal{L}_O \cup \mathcal{L}_D = \mathcal{L}$, and $|\mathcal{L}_O| = |\mathcal{L}_D|$.
Given $\mathcal{L}_O$ and $\mathcal{L}_D$, we define an arbitrary but fixed bijective function $L_D : \mathcal{L}_O \to \mathcal{L}_D$ that maps an original location $l_O \in \mathcal{L}_O$ to its corresponding duplicate location $l_D = L_D(l_O)$. The inverse of $L_D$ is denoted by $L_D^{-1}$ and is uniquely defined. We write $(l_D, l'_D) = L_D(l_O, l'_O)$ as shorthand for $l_D = L_D(l_O)$ and $l'_D = L_D(l'_O)$. Function $L_D$ implements a correspondence between original and duplicate locations, which we need to define QED-consistency (Definition 11 below).

An instruction $i \in I$ with $i = (\mathit{op}, l, (l', l''))$ is defined by an opcode $\mathit{op} \in \mathit{Op}$, an output location $l \in \mathcal{L}$, and a pair of input locations $(l', l'') \in \mathcal{L}^2$. Function $\mathit{op} : I \to \mathit{Op}$ maps an instruction to its opcode $\mathit{op}(i)$. Functions $L_{out} : I \to \mathcal{L}$ and $L_{in} : I \to \mathcal{L}^2$ map an instruction $i$ to its output and input locations $L_{out}(i) = l$ and $L_{in}(i) = (l', l'')$, respectively. Given a state $s = (s_a, s_{\bar{a}})$, instruction $i$ reads values in $s$ from its input locations $L_{in}(i)$ and writes a value to its output location $L_{out}(i)$, resulting in a transition to a new state $s' = (s'_a, s'_{\bar{a}})$, written as $s' = T(s, i)$. The transition function $T$ is total, i.e., for every instruction $i$ and state $s$, there exists a successor state $s' = T(s, i)$. As mentioned above, we have kept the model simple in order to make the presentation more accessible, but our results can be lifted to many extensions, including, e.g., more complicated kinds of instructions or instructions with enabledness conditions (cf. [15]).

We write $\mathbf{i} \in I^n$ and $\mathbf{s} \in S^n$ to denote sequences $\mathbf{i} = \langle i_1, \ldots, i_n \rangle$ and $\mathbf{s} = \langle s_1, \ldots, s_n \rangle$ of $n$ instructions and $n$ states, respectively. We will use :: for sequence concatenation and extend the transition function $T$ to sequences as follows.

Definition 2 (Path). Given sequences $\mathbf{i} = \langle i_1, \ldots, i_n \rangle$ and $\mathbf{s} = \langle s_1, \ldots, s_n \rangle$ of $n$ instructions and states, $\mathbf{s}$ is a path from state $s_0 \in S$ to $s_n$ via $\mathbf{i}$, written $\mathbf{s} = T(s_0, \mathbf{i})$, iff $\bigwedge_{k=0}^{n-1} s_{k+1} = T(s_k, i_{k+1})$. If $\mathbf{s} = T(s_0, \mathbf{i})$, then for convenience we also write $s_n = T(s_0, \mathbf{i})$ to denote the final state $s_n$.

Definition 3 (Reachable State). A state $s$ is reachable, written $\mathit{reach}(s)$, iff $s = T(s_0, \mathbf{i})$ for some $s_0 \in S_I$ and instruction sequence $\mathbf{i}$.

The set $I$ of instructions contains as proper subsets the sets of original and duplicate instructions, $I_O$ and $I_D$, respectively. Original (duplicate) instructions operate only on original (duplicate) locations, i.e., $\forall i_O \in I_O.\ L_{in}(i_O) \in \mathcal{L}_O^2 \wedge L_{out}(i_O) \in \mathcal{L}_O$ and $\forall i_D \in I_D.\ L_{in}(i_D) \in \mathcal{L}_D^2 \wedge L_{out}(i_D) \in \mathcal{L}_D$. Given these definitions, we formalize instruction duplication as follows.

Definition 4 (Instruction Duplication). Let $\mathit{Dup} : I_O \to I_D$ be an instruction duplication function that maps an original instruction $i_O = (\mathit{op}, l_O, (l'_O, l''_O))$ to a duplicate instruction $i_D = \mathit{Dup}(i_O) = (\mathit{op}, L_D(l_O), L_D(l'_O, l''_O))$ with respect to the bijective function $L_D$.

An original instruction and its duplicate have the same opcode. We write $\mathbf{i}_O \in I_O^n$ and $\mathbf{i}_D \in I_D^n$ to denote sequences $\mathbf{i}_O = \langle i_{O,1}, \ldots, i_{O,n} \rangle$ and $\mathbf{i}_D = \langle i_{D,1}, \ldots, i_{D,n} \rangle$ of $n$ original and duplicate instructions, respectively. We lift $\mathit{Dup}$ in the natural way also to sequences of instructions as follows.

Definition 5 (Instruction Sequence Duplication). Let $\mathbf{i}_O = \langle i_{O,1}, \ldots, i_{O,n} \rangle$ be a sequence of original instructions. Then $\mathit{Dup}(\mathbf{i}_O) = \langle \mathit{Dup}(i_{O,1}), \ldots, \mathit{Dup}(i_{O,n}) \rangle$.

IV. FORMALIZING CORRECTNESS

We formalize the correctness of instruction executions in a processor $P$ using an abstract specification relation. We then link this abstract specification to QED-consistency, the self-consistency property employed by SQED (Section V below).

For our formalization, we assume that every opcode $\mathit{op} \in \mathit{Op}$ has a specification function $\mathit{Spec}_{op} : \mathcal{V}^2 \to \mathcal{V}$ that specifies how the opcode computes an output value from input values. Using this family of functions, we define an overall abstract specification relation $\mathit{Spec} \subseteq S \times I \times S$, which expresses when an instruction $i \in I$ can transition to a state $s' \in S$ from a state $s \in S$ while respecting the opcode specification.

Definition 6 (Abstract Specification). $\forall s, s' \in S, i \in I$:

$$\mathit{Spec}(s, i, s') \leftrightarrow \forall l \in \mathcal{L}.\ \big(l \neq L_{out}(i) \to s(l) = s'(l)\big) \wedge \big(l = L_{out}(i) \to s'(l) = \mathit{Spec}_{op(i)}(s(L_{in}(i)))\big) \tag{1}$$

Equation (1) states general and natural properties that we expect to hold for a processor $P$. If an instruction $i$ executes according to its specification, then the values at locations that are not output locations of $i$ are unchanged. Additionally, the value produced at the output location of the instruction must agree with the value specified by function $\mathit{Spec}_{op(i)}$. Note that the specification relation $\mathit{Spec}$ specifies only how the architectural part of a state is updated by a transition (not the non-architectural part). Consequently, there might exist multiple successor states $s'$, differing only in their non-architectural parts, that satisfy the right-hand side of (1). This is why $\mathit{Spec}$ is a relation rather than a function.

As special cases of (1), original and duplicate instructions have the following properties: $\forall s, s' \in S, i_O \in I_O, l_O \in \mathcal{L}_O, i_D \in I_D, l_D \in \mathcal{L}_D$:

$$\big(\mathit{Spec}(s, i_O, s') \to s(l_D) = s'(l_D)\big) \ \wedge \tag{2}$$
$$\big(\mathit{Spec}(s, i_D, s') \to s(l_O) = s'(l_O)\big) \tag{3}$$

Equations (2) and (3) express that the execution of an original (duplicate) instruction does not change the values at duplicate (original) locations if the instruction executes according to its specification. The following functional congruence property of instructions also follows from (1): $\forall s_1, s_2, s', s'' \in S, i, i' \in I$:

$$\big[\mathit{op}(i) = \mathit{op}(i') \wedge \mathit{Spec}(s_1, i, s') \wedge \mathit{Spec}(s_2, i', s'') \wedge s_1(L_{in}(i)) = s_2(L_{in}(i'))\big] \to s'(L_{out}(i)) = s''(L_{out}(i')) \tag{4}$$

By functional congruence, if two instructions with the same opcode are executed on inputs with the same values, then the output values are the same. We next define the correctness of a processor $P$ based on the abstract specification $\mathit{Spec}$.

Definition 7 (Correctness). A processor $P$ is correct with respect to specification $\mathit{Spec}$ iff $\forall i \in I, s \in S.\ \mathit{reach}(s) \to \mathit{Spec}(s, i, T(s, i))$.

Correctness requires every instruction to execute according to the abstract specification $\mathit{Spec}$ in every reachable state of $P$. A bug in $P$ is a counterexample to correctness, i.e., an instruction that fails in at least one (not necessarily initial) reachable state and may or may not fail in other states.

Definition 8 (Bug). A bug with respect to specification $\mathit{Spec}$ in a processor $P$ is defined by a pair $B = \langle i_b, S_b \rangle$ consisting of an instruction $i_b \in I$ and a non-empty set $S_b \subseteq S$ of states such that $S_b = \{ s \in S \mid \mathit{reach}(s) \wedge \neg \mathit{Spec}(s, i_b, T(s, i_b)) \}$.

The above definitions rely on the notion of an abstract specification relation. Having some abstract specification is a theoretical construct that is necessary to formally characterize instruction failure and establish formal proofs about SQED. However, it is important to note that to apply SQED in practice, we do not need to know what the abstract specification relation is.

A bug $\langle i_b, S_b \rangle$ is precisely characterized by the set $S_b$ of all reachable states in which $i_b$ fails. The following proposition follows from Definitions 7 and 8.

Proposition 1. A processor $P$ has a bug with respect to specification $\mathit{Spec}$ iff it is not correct with respect to $\mathit{Spec}$.

As special cases of processor correctness and bugs, respectively, we define correctness and bugs with respect to instructions that are executed in an initial state only.

Definition 9 (Single-Instruction Correctness). Processor $P$ is single-instruction correct iff: $\forall i \in I, s_0 \in S_I.\ \mathit{Spec}(s_0, i, T(s_0, i))$.

Single-instruction correctness implies that all instructions, i.e., all opcodes and all combinations of input and output locations, execute correctly in all initial states. A single-instruction bug is a counterexample to single-instruction correctness.
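To make Definitions 6 and 9 concrete, the following Python sketch checks the specification relation and single-instruction correctness by brute force on a deliberately tiny model. It is only an illustration under stated assumptions: three locations, values modulo 4, a single opcode, states reduced to their architectural part (the non-architectural part is fixed, so enumerating architectural states enumerates the initial states), and two hypothetical transition functions `t_good` and `t_buggy` that are not taken from the paper.

```python
from itertools import product

NUM_LOCS = 3       # assumed tiny location set {0, 1, 2}
VALUES = range(4)  # assumed value domain
SPEC_OP = {"ADD": lambda a, b: (a + b) % 4}  # per-opcode specification


def spec(s, instr, s_next):
    """Relation (1): non-output locations unchanged, output as specified."""
    op, out, (in1, in2) = instr
    unchanged = all(s[l] == s_next[l] for l in range(NUM_LOCS) if l != out)
    return unchanged and s_next[out] == SPEC_OP[op](s[in1], s[in2])


def t_good(s, instr):
    """Hypothetical transition function that follows the specification."""
    op, out, (in1, in2) = instr
    result = SPEC_OP[op](s[in1], s[in2])
    return s[:out] + (result,) + s[out + 1:]


def t_buggy(s, instr):
    """Hypothetical buggy variant: writing location 2 also clears location 0."""
    s_next = t_good(s, instr)
    if instr[1] == 2:
        s_next = (0,) + s_next[1:]
    return s_next


INSTRS = [(op, out, (a, b))
          for op in SPEC_OP
          for out in range(NUM_LOCS)
          for a in range(NUM_LOCS)
          for b in range(NUM_LOCS)]


def single_instruction_correct(transition):
    """Check Spec(s0, i, T(s0, i)) for every instruction and initial state."""
    return all(spec(s0, i, transition(s0, i))
               for s0 in product(VALUES, repeat=NUM_LOCS)
               for i in INSTRS)


print(single_instruction_correct(t_good))
print(single_instruction_correct(t_buggy))
```

Exhaustive enumeration is feasible only because the model is tiny; it stands in for the symbolic reasoning a formal tool would perform.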

Definition 10 (Single-Instruction Bug). Processor $P$ has a single-instruction bug with respect to specification $\mathit{Spec}$ iff $\exists i \in I, s_0 \in S_I.\ \neg \mathit{Spec}(s_0, i, T(s_0, i))$.

Several approaches exist for single-instruction checking of a processor, which is complementary to SQED (cf. Section VII).

V. SELF-CONSISTENCY AS QED-CONSISTENCY

We now define QED-consistency (cf. Section II) as a property of states of a processor $P$ based on function $L_D$. Then we formally define the notion of QED test and show that for correct processors, QED tests preserve QED-consistency. This result is key to the proof of soundness in Section VI below.

Definition 11 (QED-Consistency). A state $s$ is QED-consistent, written $\mathit{QEDcons}(s)$, iff $\forall l_O \in \mathcal{L}_O.\ s(l_O) = s(L_D(l_O))$.

QED-consistency is based on checking the architectural part of a state. An equivalent condition can be formulated in terms of duplicate locations: $\forall l_D \in \mathcal{L}_D.\ s(l_D) = s(L_D^{-1}(l_D))$.

Definition 12 (QED test). An instruction sequence $\mathbf{i}$ is a QED test if $\mathbf{i} = \mathbf{i}_O :: \mathit{Dup}(\mathbf{i}_O)$ for some sequence $\mathbf{i}_O$ of original instructions.

We link the abstract specification $\mathit{Spec}$ to the semantics of original and duplicate instructions. This way, we obtain a notion of functional congruence that readily follows as a special case from (4).

Corollary 1 (Functional Congruence: Duplicate Instructions). Given $i_O \in I_O$ and $i_D \in I_D$ with $i_D = \mathit{Dup}(i_O)$, the following holds for all states $s_1$, $s_2$, $s'$, and $s''$:

$$\big[\mathit{Spec}(s_1, i_O, s') \wedge \mathit{Spec}(s_2, i_D, s'') \wedge s_1(L_{in}(i_O)) = s_2(L_D(L_{in}(i_O)))\big] \to s'(L_{out}(i_O)) = s''(L_D(L_{out}(i_O)))$$

Corollary 1 states that an original instruction $i_O$ produces the same value at its output location as its duplicate instruction $i_D = \mathit{Dup}(i_O)$, provided that these instructions execute in states where the values at the respective input locations match.

We generalize Corollary 1 to show that after executing a pair of original and duplicate instructions, the values at all original locations match the values at the corresponding duplicate locations, assuming those values also matched before executing the instructions.

Lemma 1 (cf. Corollary 1). Given $i_O \in I_O$ and $i_D \in I_D$ with $i_D = \mathit{Dup}(i_O)$, the following holds for all states $s_1$, $s_2$, $s'$, and $s''$:

$$\big[\mathit{Spec}(s_1, i_O, s') \wedge \mathit{Spec}(s_2, i_D, s'') \wedge \forall l_O \in \mathcal{L}_O.\ s_1(l_O) = s_2(L_D(l_O))\big] \to \forall l_O \in \mathcal{L}_O.\ s'(l_O) = s''(L_D(l_O))$$

Proof. See appendix.

Lemma 1 leads to an important result that we need to prove soundness of SQED (Lemma 3 below): executing a QED test $\mathbf{i}$ starting in a QED-consistent state results in a QED-consistent final state if all instructions in $\mathbf{i}$ execute according to the abstract specification $\mathit{Spec}$ (cf. Fig. 1b).
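The preservation of QED-consistency by specification-conforming executions can be observed on a small model. The sketch below exhaustively runs every QED test up to a given length from every QED-consistent state of a toy processor whose `step` function follows the specification by construction. The model is an assumption for illustration only: two original locations, two duplicates obtained by an offset mapping, values modulo 3, and the names `loc_dup`, `dup`, `step`, and `all_tests_preserve_consistency` are hypothetical.

```python
from itertools import product

N_ORIG = 2         # assumed original locations {0, 1}; duplicates {2, 3}
VALUES = range(3)  # assumed value domain
OPS = {
    "ADD": lambda a, b: (a + b) % 3,
    "MUL": lambda a, b: (a * b) % 3,
}


def loc_dup(l):
    """Assumed mapping L_D(l) = l + N_ORIG."""
    return l + N_ORIG


def dup(instr):
    op, out, (a, b) = instr
    return (op, loc_dup(out), (loc_dup(a), loc_dup(b)))


def step(s, instr):
    """Specification-conforming transition on architectural states."""
    op, out, (a, b) = instr
    return s[:out] + (OPS[op](s[a], s[b]),) + s[out + 1:]


def qed_consistent(s):
    return all(s[l] == s[loc_dup(l)] for l in range(N_ORIG))


ORIG_INSTRS = [(op, out, (a, b))
               for op in OPS
               for out in range(N_ORIG)
               for a in range(N_ORIG)
               for b in range(N_ORIG)]


def all_tests_preserve_consistency(max_len):
    """Run every QED test with up to max_len original instructions."""
    for n in range(1, max_len + 1):
        for orig_seq in product(ORIG_INSTRS, repeat=n):
            test = list(orig_seq) + [dup(i) for i in orig_seq]
            for half in product(VALUES, repeat=N_ORIG):
                s = half + half  # QED-consistent start state
                for instr in test:
                    s = step(s, instr)
                if not qed_consistent(s):
                    return False
    return True


print(all_tests_preserve_consistency(2))
```

Such an enumeration only samples a finite model and bound; it illustrates the statement but is no substitute for the proof.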

Lemma 2 (QED-Consistency and QED tests). Let $\mathbf{i} = \langle i_1, \ldots, i_{2n} \rangle$ be a QED test, let $\langle s_0, \ldots, s_{2n} \rangle$ be a sequence of $2n + 1$ states, and let $\mathit{Spec}$ be some abstract specification relation. Then,

$$\mathit{QEDcons}(s_0) \wedge \Big( \bigwedge_{j=0}^{2n-1} \mathit{Spec}(s_j, i_{j+1}, s_{j+1}) \Big) \to \mathit{QEDcons}(s_{2n})$$

Proof. Assuming the antecedent, let $l_O \in \mathcal{L}_O$ be arbitrary but fixed with $l_D = L_D(l_O)$. By repeated application of (2), we derive $s_0(l_D) = s_1(l_D) = \ldots = s_n(l_D)$, and hence:

$$s_0(l_D) = s_n(l_D) \tag{5}$$

by transitivity. By repeated application of (3), we derive:

$$s_n(l_O) = s_{2n}(l_O) \tag{6}$$

Now, $\mathit{QEDcons}(s_0)$ implies $s_0(l_O) = s_0(L_D(l_O))$, from which it follows by (5) that $s_0(l_O) = s_n(L_D(l_O))$. By repeated application of Lemma 1, we can next derive $s_j(l_O) = s_{n+j}(L_D(l_O))$ for $0 \leq j \leq n$, and in particular, $s_n(l_O) = s_{2n}(L_D(l_O))$. Finally, by applying (6), we get $s_{2n}(l_O) = s_{2n}(L_D(l_O))$. Since $l_O$ was chosen arbitrarily, $\mathit{QEDcons}(s_{2n})$ holds.

VI. SOUNDNESS AND CONDITIONAL COMPLETENESS

SQED checks a processor $P$ for self-consistency by executing QED tests and checking QED-consistency (cf. Fig. 1a). We now define the correctness of $P$ in terms of QED tests that, when executed, always result in QED-consistent states. This way, we establish a correspondence between counterexamples to QED-consistency and bugs in $P$. We then prove our main results (Theorem 1) related to the bug-finding capabilities of SQED, i.e., soundness and conditional completeness.

Definition 13 (Failing and Succeeding QED Tests). Let $\mathbf{i}$ be a QED test, $s_0 \in S_I$ an initial state such that $\mathit{QEDcons}(s_0)$ holds, and let $s = T(s_0, \mathbf{i})$. We say that:
• QED test $\mathbf{i}$ fails if $\neg \mathit{QEDcons}(s)$.
• QED test $\mathbf{i}$ succeeds if $\mathit{QEDcons}(s)$.

Definition 14 (Processor QED-Consistency). A processor $P$ is QED-consistent if all possible QED tests succeed.

Deﬁnition 15 (Processor QED-Inconsistency) . A processor P is QED-inconsistent if some QED test fails.
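The notions of QED-consistency and of succeeding and failing QED tests can be illustrated on a toy model. The following is a minimal sketch, not the paper's formalization: the four-location register file, the mapping `l_d` (standing in for $L_D$), the two opcodes, and all function names are assumptions chosen for illustration, and `step` models a processor that always follows the specification.

```python
from typing import Callable, Dict, List, Tuple

Instr = Tuple[str, int, Tuple[int, int]]  # (opcode, output location, (input 1, input 2))
State = Tuple[int, ...]                   # value stored at each location

HALF = 2  # assumed layout: locations 0..1 are original, 2..3 are their duplicates

# Hypothetical opcode specification (plays the role of Spec_op).
SPEC_OP: Dict[str, Callable[[int, int], int]] = {
    "ADD": lambda a, b: (a + b) % 256,
    "MUL": lambda a, b: (a * b) % 256,
}

def l_d(loc: int) -> int:
    """Assumed bijection from original to duplicate locations."""
    return loc + HALF

def dup(instr: Instr) -> Instr:
    """Duplicate instruction: same opcode, all locations mapped through l_d."""
    op, out, (a, b) = instr
    return (op, l_d(out), (l_d(a), l_d(b)))

def step(state: State, instr: Instr) -> State:
    """Transition of a processor that executes exactly according to SPEC_OP."""
    op, out, (a, b) = instr
    regs = list(state)
    regs[out] = SPEC_OP[op](state[a], state[b])
    return tuple(regs)

def qed_cons(state: State) -> bool:
    """QED-consistency: every original location agrees with its duplicate."""
    return all(state[loc] == state[l_d(loc)] for loc in range(HALF))

def qed_test(originals: List[Instr]) -> List[Instr]:
    """A QED test: the original instructions followed by their duplicates."""
    return originals + [dup(i) for i in originals]

def succeeds(s0: State, originals: List[Instr]) -> bool:
    """Run the QED test from a QED-consistent state and check the final state."""
    assert qed_cons(s0), "the initial state must be QED-consistent"
    state = s0
    for instr in qed_test(originals):
        state = step(state, instr)
    return qed_cons(state)
```

Because `step` follows the specification for every instruction, every QED test built this way succeeds, which is the situation described by Lemma 2.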

Lemma 3. Let $P$ be a processor. If $P$ is QED-inconsistent, then $P$ is not correct with respect to any abstract specification relation.

Proof. Let $\mathbf{i}$ be a failing QED test for $P$ and assume that processor $P$ is correct with respect to some abstract specification relation $Spec$. By Lemma 2, we conclude $QEDcons(s_{2n})$, which contradicts the assumption that $\mathbf{i}$ is a failing QED test.

Importantly, Lemma 3 holds regardless of what the actual specification relation $Spec$ is, i.e., it is independent of $Spec$ and the opcode specification function $Spec_{op}$ (Definition 6). Lemma 3 shows that SQED is a sound technique: any error reported by a failing QED test is in fact a real bug in the system. It is more challenging to determine the degree to which SQED is complete, that is, for which bugs do there exist failing QED tests? We address this question next.

Suppose that $B = \langle i_b, S_b \rangle$ is a bug with respect to a specification $Spec$ in a processor $P$, where $i_b = (op_b, l^b_{out}, (l^b_{in1}, l^b_{in2}))$. A bug-specific QED test for $B$ is a QED test that sets up the conditions for and includes the activation of the bug. By Definition 8, if $i_b$ is executed in $P$ starting from any state in $S_b$, the specification is violated. That is, for each $s_b \in S_b$, $\neg Spec(s_b, i_b, T(s_b, i_b))$. Let $s = T(s_b, i_b)$. According to (1), there are two ways the specification can be violated. Either:
(A) the value in the output location of $i_b$ is different from that required by $Spec$, i.e., $s(l^b_{out}) \neq Spec_{op_b}(s_b(l^b_{in1}), s_b(l^b_{in2}))$, which we call a type-A bug; or
(B) the value in some other, non-output location $l_{bad}$ is not preserved, i.e., $s(l_{bad}) \neq s_b(l_{bad})$ for some $l_{bad} \neq l^b_{out}$, which we call a type-B bug.
We now define a bug-specific QED test formally.

Definition 16 (Bug-Specific QED Test). Let $B = \langle i_b, S_b \rangle$ be a bug in $P$ with respect to $Spec$, where $i_b = (op_b, l^b_{out}, (l^b_{in1}, l^b_{in2}))$. The instruction sequence $\mathbf{i} = \langle i_1, \ldots, i_n, i_{n+1}, \ldots, i_{2n} \rangle$ is a bug-specific QED test for $B$ if the following conditions hold:
1. $i_{n+1} = i_b$.
2. $\mathbf{i}$ is a QED test for some $L_D$, i.e., for $1 \leq k \leq n$, $i_{n+k} = Dup(i_k)$. In particular, $i_1 = (op_b, l_{out}, (l_{in1}, l_{in2}))$, with $(l_{in1}, l_{in2}, l_{out}) = L_D^{-1}((l^b_{in1}, l^b_{in2}, l^b_{out}))$.
3. There exists a path $\mathbf{s} \in S^{2n+1}$ from $s_0 \in S_I$ with $QEDcons(s_0)$, such that $\mathbf{s} = T(s_0, \mathbf{i}) = \langle s_0, \ldots, s_n, s_{n+1}, \ldots, s_{2n} \rangle$, where $s_n \in S_b$.
4. $Spec(s_0, i_1, s_1)$.
5. Additionally, we need three more conditions that depend on the bug types:

Case A: If $i_b$ is a type-A bug with respect to $s_n$, i.e., $s_{n+1}(l^b_{out}) \neq Spec_{op_b}(s_n(l^b_{in1}), s_n(l^b_{in2}))$, then let $l_{orig} = l_{out}$ and $l_{dup} = l^b_{out}$.
• We then require:
  – $s_{n+1}(l_{dup}) = s_{2n}(l_{dup})$,
  – $s_1(l_{orig}) = s_{2n}(l_{orig})$,
  – $s_0(L_{in}(i_b)) = s_n(L_{in}(i_b))$.

Case B: If $i_b$ is a type-B bug with respect to $s_n$, i.e., $s_n(l_{bad}) \neq s_{n+1}(l_{bad})$ for some $l_{bad} \neq l^b_{out}$, then let $l_{orig} = L_D^{-1}(l_{bad})$ with $l_{orig} \neq l_{out}$ and $l_{dup} = l_{bad}$.
• We then require:
  – $s_{n+1}(l_{dup}) = s_{2n}(l_{dup})$,
  – $s_1(l_{orig}) = s_{2n}(l_{orig})$,
  – $s_1(l_{dup}) = s_n(l_{dup})$.

Clearly, it is always possible to satisfy the first two conditions by declaring the buggy instruction $i_b$ to be the duplicate of $i_1$ with respect to some function $L_D$. Moreover, if we restrict our attention to single-instruction correct processors, then the fourth condition always holds as well. This fits in well with the stated intended role of SQED, which is to find sequence-dependent bugs rather than single-instruction bugs.

Understanding when the remaining conditions 3 and 5 hold is more complicated. We must find some instruction sequence $\mathbf{i}^* = \langle i_2 \ldots i_n \rangle$ that can transition $P$ from the state $s_1$ following the execution of $i_1$ to one of the bug-triggering states in $S_b$, i.e., $s_n$. Often it is reasonable to assume that $P$ is strongly connected, i.e., that there always exists an instruction sequence that can transition from one reachable state to another. This is almost enough to ensure the existence of $\mathbf{i}^*$. However, there are a few other restrictions on $\mathbf{i}^*$ to satisfy Definition 16. First, $\mathbf{i}^*$ must consist of only original instructions to satisfy the definition of a QED test. We are free to choose $L_D$ to be anything that works, so the main restriction is that $\mathbf{i}^*$ cannot use any instructions referencing locations that are used by $i_b$, i.e., $l^b_{in1}$, $l^b_{in2}$, or $l^b_{out}$. Note that we defined $i_{n+1} = i_b$ to be the first duplicate instruction. This ends up being the most severe restriction on $\mathbf{i}^*$ because it means that instructions in $\mathbf{i}^*$ [...] $i_b$. We discuss some mitigations to this restriction in Section VI-A.

Somewhat surprisingly, the three requirements in condition 5 are not very severe, as we now explain. For both type-A and type-B bugs, locations $l_{orig}$ and $l_{dup}$ are an original location and its duplicate, respectively, that will hold inconsistent values when the QED test $\mathbf{i}$ fails. For type-A bugs, $l_{orig}$ holds the correct output value of $i_1$ and $l_{dup}$ holds the incorrect output value of $i_b$. For type-B bugs, $l_{dup}$ holds the value of location $l_{bad}$ that is incorrectly modified when $i_b$ is executed in state $s_n$, and $l_{orig}$ is the original location that corresponds to $l_{dup} = l_{bad}$.

The first requirement $s_{n+1}(l_{dup}) = s_{2n}(l_{dup})$ means that the duplicate sequence $Dup(\mathbf{i}^*)$ of $\mathbf{i}^*$ in the QED test has to preserve the value of $l_{dup}$ in $s_{n+1}$ also in the final state $s_{2n}$. Further, since $l_{orig} = L_D^{-1}(l_{dup})$, this also imposes restrictions on the modifications that $\mathbf{i}^*$ can make to $l_{orig}$. However, as this is just one original location, it is unlikely that every possible $\mathbf{i}^*$ would need to modify it to get to some bug-triggering state $s_n$.

The second requirement is $s_1(l_{orig}) = s_{2n}(l_{orig})$. For similar reasons, it is unlikely that $\mathbf{i}^*$ would need to modify $l_{orig}$, and the duplicate sequence $Dup(\mathbf{i}^*)$ of $\mathbf{i}^*$ should not modify it either, since it is an original location and original locations should be left alone by duplicate instructions. Although the buggy instruction $i_b$ might modify $l_{orig}$ if it has more than one bug effect, we may be able to choose the locations of $i_1$ and $L_D$ differently to avoid this.

Finally, the last requirement of condition 5 depends on the two cases A and B. In both cases, we require that $\mathbf{i}^*$ does not modify certain duplicate locations: the input locations $L_{in}(i_b)$ of $i_b$ (A) and location $l_{dup}$ that is incorrectly modified by $i_b$ (B). Sequence $\mathbf{i}^*$ should not modify any duplicate locations as it is composed of original instructions. Note that we do not have to make the strong assumption that $\mathbf{i}^*$ executes according to its specification, only that it avoids corrupting a few key locations. Given that we have a lot of freedom in choosing $L_D$ and hence the locations of $i_1$, these requirements are likely to be satisfiable if there are some degrees of freedom in choosing a path to one of the bug-triggering states.

We now prove our conditional completeness property, namely that if a bug-specific QED test $\mathbf{i}$ exists, then $\mathbf{i}$ fails.

Lemma 4.

Let $P$ be a processor with a bug $B = \langle i_b, S_b \rangle$ with respect to specification $Spec$, for which there exists a bug-specific QED test $\mathbf{i}$. Then $\mathbf{i}$ fails.

Proof. Let $B = \langle i_b, S_b \rangle$ be a bug and $\mathbf{i}$ be a bug-specific QED test for $B$. By Definition 16 we have $\mathbf{i} = \langle i_1, \ldots, i_n, i_{n+1}, \ldots, i_{2n} \rangle$ and $\mathbf{s} = T(s_0, \mathbf{i}) = \langle s_0, s_1, \ldots, s_n, s_{n+1}, \ldots, s_{2n} \rangle$, where $s_n \in S_b$ and $i_b = i_{n+1}$, and $QEDcons(s_0)$ holds. We show that $\neg QEDcons(s_{2n})$ holds by showing that $s_{2n}(l_{orig}) \neq s_{2n}(l_{dup})$. We distinguish the two cases A and B in Definition 16.

Case A. Since $QEDcons(s_0)$ and $Dup(i_1) = i_b$, we have

\[ s_0(L_{in}(i_1)) = s_0(L_{in}(i_b)) \tag{7} \]

From the third requirement of Case A in Definition 16, we have $s_0(L_{in}(i_b)) = s_n(L_{in}(i_b))$, so it follows that

\[ s_0(L_{in}(i_1)) = s_n(L_{in}(i_b)) \tag{8} \]

By (8) and since $op(i_1) = op(i_b)$, also

\[ Spec_{op(i_1)}(s_0(L_{in}(i_1))) = Spec_{op(i_b)}(s_n(L_{in}(i_b))) \tag{9} \]

Since $Spec(s_0, i_1, s_1)$ by Definition 16, we have

\[ s_1(L_{out}(i_1)) = Spec_{op(i_1)}(s_0(L_{in}(i_1))) \tag{10} \]

Since we are in Case A, we have from Definition 16 that $l_{orig} = L_{out}(i_1)$, and from the second requirement of Case A, we have $s_1(l_{orig}) = s_{2n}(l_{orig})$, so it follows that

\[ s_{2n}(l_{orig}) = Spec_{op(i_1)}(s_0(L_{in}(i_1))) \tag{11} \]

Since $i_b$ fails in state $s_n$, we have that

\[ s_{n+1}(L_{out}(i_b)) \neq Spec_{op(i_b)}(s_n(L_{in}(i_b))) \tag{12} \]

Again, from Case A in Definition 16, we have $l_{dup} = L_{out}(i_b)$, and from the first requirement of Case A, we have $s_{n+1}(l_{dup}) = s_{2n}(l_{dup})$, so it follows that

\[ s_{2n}(l_{dup}) \neq Spec_{op(i_b)}(s_n(L_{in}(i_b))) \tag{13} \]

Finally, (9) and (11) give us

\[ s_{2n}(l_{orig}) = Spec_{op(i_b)}(s_n(L_{in}(i_b))) \tag{14} \]

But then (13) and (14) imply $s_{2n}(l_{orig}) \neq s_{2n}(l_{dup})$, and hence $\neg QEDcons(s_{2n})$.

Case B. See appendix.

Theorem 1.
• SQED is sound (Lemma 3).
• SQED is complete for bugs for which a bug-specific QED test exists (Lemma 4).
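To make the two parts of Theorem 1 concrete, the sketch below replaces BMC with explicit enumeration over a tiny toy processor. Everything in it is an assumption made for illustration: the two-opcode instruction set, the 3-bit values, the fixed mapping $L_D(l) = l + 2$, and the injected bug (a MUL that directly follows another MUL returns a result that is off by one). A real SQED flow explores these tests symbolically rather than one by one.

```python
from itertools import product

OPS = {"ADD": lambda a, b: (a + b) % 8, "MUL": lambda a, b: (a * b) % 8}
HALF = 2  # locations 0, 1 are original; 2, 3 are duplicates; assumed L_D(l) = l + HALF

def dup(instr):
    op, out, (a, b) = instr
    return (op, out + HALF, (a + HALF, b + HALF))

def qed_cons(regs):
    return all(regs[l] == regs[l + HALF] for l in range(HALF))

def make_step(buggy):
    """Transition function; the state is (registers, last executed opcode)."""
    def step(state, instr):
        regs, last_op = state
        op, out, (a, b) = instr
        value = OPS[op](regs[a], regs[b])
        # Hypothetical sequence-dependent bug: back-to-back MULs corrupt the result.
        if buggy and op == "MUL" and last_op == "MUL":
            value = (value + 1) % 8
        new = list(regs)
        new[out] = value
        return (tuple(new), op)
    return step

def find_failing_test(step, bound):
    """Enumerate all QED tests with up to `bound` original instructions."""
    originals = [(op, out, (a, b)) for op in OPS
                 for out in range(HALF) for a in range(HALF) for b in range(HALF)]
    # All QED-consistent initial register files; no instruction has executed yet.
    init_states = [((x, y, x, y), None) for x in range(8) for y in range(8)]
    for n in range(1, bound + 1):
        for seq in product(originals, repeat=n):
            test = list(seq) + [dup(i) for i in seq]
            for s0 in init_states:
                state = s0
                for instr in test:
                    state = step(state, instr)
                if not qed_cons(state[0]):
                    return test, s0  # counterexample to QED-consistency
    return None

correct = find_failing_test(make_step(buggy=False), bound=2)
failing = find_failing_test(make_step(buggy=True), bound=2)
```

In this sketch the specification-following processor yields no failing test within the bound, while the buggy one yields a shortest failing test consisting of one original MUL followed by its duplicate, the duplicate being the instruction that activates the bug.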

Theorem 1 is relevant for practical applications of SQED. Referring to the high-level workflow shown in Fig. 1a, BMC symbolically explores all possible QED tests up to bound $n$ for a particular fixed mapping $L_D$. If a failing QED test $\mathbf{i}$ is found, then by the soundness of SQED, $\mathbf{i}$ corresponds to a bug in the processor. By completeness, if there exists a bug for which a bug-specific QED test $\mathbf{i}$ exists, then with a sufficiently large bound $n$, BMC will find a sequence $\mathbf{i}$ that will fail.

A. Extensions

We now consider variants of QED tests that cover a larger class of bugs (i.e., bugs that cannot be detected by a bug-specific QED test). Ultimately, with hardware support we obtain a family of QED tests which, together with single-instruction correctness, results in a complete variant of SQED (Theorem 2).

The main limitation of bug-specific QED tests arises from the fact that QED tests consist of a sequence of original instructions followed by duplicate ones. This makes it impossible to set up a bug-specific QED test for an important class of forwarding-logic bugs (a simple refinement of our model can be used for the important case of pipelined systems). To see why, consider that a bug-triggering state $s_n \in S_b$ must be reached by executing a sequence of original instructions. The buggy instruction, which is a duplicate, is executed in state $s_n$ and would have to read a value from some original location written previously.

To resolve this limitation, first note that there is another way that SQED can find bugs, namely by finding QED tests for which the bug occurs during the original sequence, but not during the duplicate one. This kind of QED test is much more effective with a simple extension to allow no-operation instructions (a trick also employed in [11]). To formalize this, we first define a set $N$ of no-operation instructions (NOPs).

Definition 17. Let $N$ be the set of instructions such that, for every state $(s_a, s_{\bar{a}})$, if $i_{nop} \in N$, then $T((s_a, s_{\bar{a}}), i_{nop}) = (s_a, s'_{\bar{a}})$ for some $s'_{\bar{a}} \in S_{\bar{a}}$.

An instruction in $N$ may change the non-architectural part of a state, but not the architectural part.

Definition 18. An extended QED test is any sequence of instructions obtained from a standard QED test by inserting zero or more instructions from $N$ anywhere in the sequence.

Extended QED tests enjoy the same properties as standard QED tests. In particular, an appropriately lifted version of Lemma 2 holds and the notions of failing and succeeding QED tests can be lifted to extended QED tests in the obvious way.
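As a rough illustration of Definitions 17 and 18, the sketch below generates extended QED tests by inserting NOPs into a standard QED test. The tuple encoding of instructions, the `NOP` constant, the two-part state `(registers, last opcode)`, and the helper names are hypothetical; the second state component plays the role of the non-architectural part that a NOP is allowed to change.

```python
from itertools import combinations_with_replacement

NOP = ("NOP", None, ())  # hypothetical encoding of an instruction from N

def step(state, instr):
    """Toy transition: state = (architectural registers, last opcode seen)."""
    regs, last_op = state
    op, out, ins = instr
    if op == "NOP":
        # A NOP may change the non-architectural part only.
        return (regs, "NOP")
    a, b = ins
    new = list(regs)
    new[out] = (regs[a] + regs[b]) % 256 if op == "ADD" else (regs[a] * regs[b]) % 256
    return (tuple(new), op)

def extended_tests(qed_test, max_nops):
    """Yield every sequence obtained by inserting 0..max_nops NOPs anywhere."""
    slots = range(len(qed_test) + 1)  # before each instruction and at the end
    for k in range(max_nops + 1):
        for positions in combinations_with_replacement(slots, k):
            out = []
            for idx in range(len(qed_test) + 1):
                out.extend([NOP] * positions.count(idx))
                if idx < len(qed_test):
                    out.append(qed_test[idx])
            yield out

def run(state, instrs):
    for instr in instrs:
        state = step(state, instr)
    return state
```

On a processor that follows the specification, every extended test reaches the same architectural state as the standard test it was derived from, which is the intuition behind lifting Lemma 2 to extended QED tests.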

Definition 19 (Bug-Hunting Extended QED Test). Let $P$ be a single-instruction correct processor with at least one bug. The instruction sequence $\mathbf{i}$ is a bug-hunting extended QED test with a bug-prefix of size $k$ and initial state $s_0$ for $P$ if the following conditions hold:
1. There is some bug $B = \langle i_b, S_b \rangle$ in $P$ such that $T(s_0, \langle i_1, \ldots, i_{k-1} \rangle) \in S_b$ and $i_k = i_b$.
2. $\mathbf{i}$ is an extended QED test.
3. $i_k$ is an original instruction, and $i_{k+1} = Dup(i_1)$.

Unlike a bug-specific QED test, a bug-hunting extended QED test is not guaranteed to fail. It starts with a bug-triggering sequence of length $k$, and then finishes with a modified duplicate sequence which may add (or subtract) NOPs from $N$. The NOPs can be used to change the timing between any interdependent instructions, making it more likely that the duplicate sequence will produce a correct result, especially if the bug depends on forwarding logic. One can show (omitted for lack of space) that for a general class of forwarding-logic bugs, there does always exist an extended QED test that fails.

Another QED test extension is to allow original and duplicate instructions to be interleaved [10], rather than requiring that all original instructions precede all duplicate instructions [8]. Again, it is straightforward to show that this extension preserves Lemma 2. Clearly, the set of bugs that can be found by adding interleaving is a strict superset of those that can be found without. In practice, implementations of SQED search for all possible extended QED tests with interleaving. Empirically, case studies have not turned up any (non-single-instruction) bugs that cannot be found with this combination. However, one can construct pathological systems with bugs that cannot be found by such QED tests. We address these cases next.

(Footnote: The bug in Example 3 can be detected by executing a QED test that interleaves original and duplicate instructions. A subsequence of two back-to-back MULs, an original instruction followed by a duplicate one, causes the duplicate instruction to produce an incorrect result at its output location. The final state is QED-inconsistent since the output location of the original instruction holds the correct value, while the output location of the duplicate holds an incorrect one.)

B. Hardware Extensions

With hardware support, stronger guarantees can be achieved that lead to our final completeness result (Theorem 2). We first introduce a soft-reset instruction, which transitions the non-architectural part of a state to the initial non-architectural state $s_{\bar{a},I}$ without changing the architectural part. Then we define a variant of bug-hunting extended QED tests where we insert soft-reset instructions in the sequence of duplicate instructions. This way, all duplicate instructions execute in an initial state and hence execute according to the specification for single-instruction correct processors. The resulting QED test always fails, in contrast to a bug-hunting extended QED test.

Definition 20. $i_r$ is a soft-reset instruction for $P$ if for every state $(s_a, s_{\bar{a}})$, $T((s_a, s_{\bar{a}}), i_r) = (s_a, s_{\bar{a},I})$.

It is easy to see that $i_r \in N$.

Definition 21 (Bug-Specific Soft-Reset QED Test). Let $P$ be single-instruction correct with at least one bug $B = \langle i_b, S_b \rangle$. The instruction sequence $\mathbf{i} = \langle i_1, \ldots, i_n \rangle$ is a bug-specific soft-reset QED test for $P$ if the following conditions hold:
1. $\mathbf{i}$ is a bug-hunting extended QED test for $P$ with a minimal bug-prefix of size $k \geq$ [...] and initial state $s_0$.
2. Let $\mathbf{s} = T(s_0, \mathbf{i})$. Then, $\forall l \in L_D.\ s_{k-1}(l) = s_k(l)$, i.e., $i_b = i_k$ does not corrupt any duplicate location.
3. $n = 3k$.
4. For each $1 \leq j \leq k$, $i_{k+2j-1} = i_r$.

Lemma 5. If $P$ is single-instruction correct and has a bug-specific soft-reset QED test $\mathbf{i}$, then $\mathbf{i}$ fails.

Proof. See appendix.

There are still a few (pathological) ways in which a bug may be missed by searching for all possible soft-reset QED tests. First, there may be no triggering sequence starting from any QED-consistent state. Second, it could be that the triggering sequence for a bug requires using more than half of all the locations, making it impossible to divide the locations among original and duplicate instructions. Finally, it could be that the bug always corrupts duplicate locations for every possible candidate sequence. These can all be remedied by adding hard reset instructions, which reset $P$ to a specific initial state.

Definition 22.

The set $\{ i_{R,s_I} \mid s_I \in S_I \}$ is a family of hard reset instructions for $P$ if for every state $s$, $T(s, i_{R,s_I}) = s_I$.

Definition 23. Let $P$ be a processor. Then $\mathbf{i} = \langle i_1 \ldots i_{2k+2} \rangle$ is a bug-specific hard-reset QED test with bug-prefix size $k$ and initial state $s_I$ for $P$ if the following conditions hold:
1. $k \geq$ [...]
2. $\langle i_1 \ldots i_k \rangle$ reach and trigger a bug $B = \langle i_b, S_b \rangle$ in $P$ starting from $s_I$, where $i_k = i_b$
3. $i_{k+1} = i_{R,s_I}$
4. $\langle i_{k+2} \ldots i_{2k} \rangle = \langle i_1 \ldots i_{k-1} \rangle$
5. $i_{2k+1} = i_r$
6. $i_{2k+2} = i_k$

Notice that there is no notion of duplication for a hard-reset QED test. Instead, the exact same sequence is executed twice except that there is a hard reset in between and a soft reset right before the last instruction. Hard-reset QED tests also use a slightly different notion of success and failure.

Definition 24. Let $\mathbf{i}$ be a bug-specific hard-reset QED test with bug-prefix size $k$ and initial state $s_I$, and let $\mathbf{s} = T(s_I, \mathbf{i})$.
• $\mathbf{i}$ succeeds if $s_k(l) = s_{2k+2}(l)$ for every location $l \in L$.
• $\mathbf{i}$ fails if $s_k(l) \neq s_{2k+2}(l)$ for some location $l \in L$.

The combination of single-instruction correctness checking and exhaustive search for hard-reset QED tests is complete.
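The shape of a hard-reset QED test and its success criterion can be sketched as follows. This is an illustrative toy only: the instruction encodings `("HARD_RESET", s_I)` and `("SOFT_RESET",)`, the state layout `(registers, last opcode)`, and the injected back-to-back-MUL bug are assumptions, and the comparison is restricted to the register values, which stand in for the locations in $L$.

```python
def make_step(buggy):
    """Toy transition function; the state is (registers, last opcode seen)."""
    def step(state, instr):
        regs, last_op = state
        if instr[0] == "HARD_RESET":
            return instr[1]            # jump to the given initial state
        if instr[0] == "SOFT_RESET":
            return (regs, None)        # reset only the non-architectural part
        op, out, (a, b) = instr
        value = (regs[a] * regs[b]) % 256 if op == "MUL" else (regs[a] + regs[b]) % 256
        # Hypothetical bug: a MUL directly after a MUL is off by one.
        if buggy and op == "MUL" and last_op == "MUL":
            value = (value + 1) % 256
        new = list(regs)
        new[out] = value
        return (tuple(new), op)
    return step

def hard_reset_test(prefix, s_init):
    """i_1..i_k, hard reset, i_1..i_{k-1}, soft reset, i_k (length 2k + 2)."""
    k = len(prefix)
    return (prefix + [("HARD_RESET", s_init)] + prefix[:k - 1]
            + [("SOFT_RESET",)] + [prefix[k - 1]])

def fails(step, prefix, s_init):
    """Compare the register values in s_k and s_{2k+2} along the path."""
    k = len(prefix)
    path = [s_init]
    for instr in hard_reset_test(prefix, s_init):
        path.append(step(path[-1], instr))
    return path[k][0] != path[2 * k + 2][0]

prefix = [("MUL", 0, (0, 1)), ("MUL", 1, (0, 1))]
s_init = ((2, 3), None)
```

With this prefix, the second MUL triggers the bug in the first run but executes right after a soft reset in the second run, so the two runs disagree on the buggy processor and agree on the specification-following one.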

Theorem 2. If $P$ is single-instruction correct and has no failing bug-specific hard-reset QED tests, then it is correct.

Proof. See appendix.

VII. RELATED WORK

Assertion-based formal verification techniques using theorem proving or (bounded) model checking, e.g., [1], [16]–[18], require implementation-specific, manually-written properties. In contrast to that, symbolic quick error detection (SQED) [7]–[10] is based on a universal self-consistency property.

In an early application of self-consistency checking for processor verification without a specification [11], given instruction sequences are transformed by, e.g., inserting NOPs. The original and the modified instruction sequence are expected to produce the same result. As a formal foundation, this approach relies on formulating and explicitly computing an equivalence relation over states, which is not needed with SQED.

SQED originates from quick error detection (QED), a post-silicon validation technique [19]–[21]. QED is highly effective in reducing the length of existing bug traces (i.e., instruction sequences) in post-silicon debugging of processor cores. To this end, existing bug traces are systematically transformed into QED tests by techniques that (among others) include instruction duplication [22]. SQED exhaustively searches for minimal-length QED tests using BMC for pre-silicon verification. It is also applicable to post-silicon validation. SQED was extended to operate with symbolic initial states [12], [23] to overcome the potential limitations of BMC when unrolling the transition relation of a design starting in a concrete initial state.

SQED employs the principle of self-consistency based on a mathematical interpretation of instructions as functions. That principle is also applied by accelerator quick error detection (A-QED) [24], a formal pre-silicon verification technique for HW accelerator designs. A-QED checks the functions implemented by an accelerator for functional consistency and, like SQED, does not require a formal specification.

Unique program execution checking [25] relies on a particular variant of self-consistency to check security vulnerabilities of processor designs for covert channel attacks. In the context of security, self-consistency is also applied to verify secure information flow by self-composition of programs [26]–[29].

Several approaches, including both formal and simulation-based approaches, exist for checking single-instruction (SI) correctness (cf. [9], [23], [30]). Checking SI correctness is complementary to checking self-consistency using SQED and is also much more tractable. In a formal approach, a property corresponding to $Spec_{op}$ (based on the ISA) is written for each opcode $op \in Op$, and the model checker is used to ensure that the property holds when starting from any initial state. Because the approach is restricted to initial states and only a single instruction execution, it is much simpler to specify and check than would be a property specifying the full correctness of $P$. Efficient specialized approaches exist for checking multiplier units [31]–[34], which is computationally hard.

VIII. CONCLUSION AND FUTURE WORK

We laid a formal foundation for symbolic quick error detection (SQED) and presented a theoretical framework to reason about its bug-finding capabilities. In our framework, we proved soundness as well as (conditional) completeness, thereby closing a gap in the theoretical understanding of SQED. Soundness implies that SQED does not produce spurious counterexamples, i.e., any counterexample to QED-consistency reported by SQED corresponds to an actual bug in the design. For completeness, we characterized a large class of bugs that can be detected by failing QED tests under modest assumptions about these bugs. We also identified several QED test extensions based on executing no-operation and reset instructions. For these extensions, we proved even stronger completeness guarantees, ultimately leading to a variant of SQED that, together with single-instruction correctness, is complete.

As future work, it would be valuable to extend our framework to consider variants of SQED that operate with more fully symbolic initial states [12], [23]. The challenge will be to identify how this can be done while guaranteeing no spurious counterexamples. For practical applications, our theoretical results provide valuable insights. For example, in present implementations of SQED [9], [10], the flexibility to partition register/memory locations into sets of original and duplicate locations and to select the bijective mapping between these two sets has not yet been explored. Similarly, it is promising to combine standard QED tests and the specialized extensions we presented in a uniform practical tool framework. Features like soft/hard reset instructions could either be implemented in HW in a design-for-verification approach or in software inside a model checker. In another research direction, we plan to extend our framework to model the detection of deadlocks using SQED, cf. [7], and prove related theoretical guarantees.

Acknowledgments. We thank Karthik Ganesan and John Tigar Humphries for helpful initial discussions and the anonymous reviewers for their feedback.

REFERENCES

[1] A. Biere, A. Cimatti, E. M. Clarke, and Y. Zhu, "Symbolic Model Checking without BDDs," in Proc. TACAS, ser. LNCS, vol. 1579. Springer, 1999, pp. 193–207.
[2] S. Katz, O. Grumberg, and D. Geist, ""Have I written enough Properties?" - A Method of Comparison between Specification and Implementation," in Proc. CHARME, ser. LNCS, vol. 1703. Springer, 1999, pp. 280–297.
[3] H. Chockler, O. Kupferman, and M. Y. Vardi, "Coverage Metrics for Temporal Logic Model Checking," in Proc. TACAS, ser. LNCS, vol. 2031. Springer, 2001, pp. 528–542.
[4] K. Claessen, "A Coverage Analysis for Safety Property Lists," in Proc. FMCAD. IEEE, 2007, pp. 139–145.
[5] D. Große, U. Kühne, and R. Drechsler, "Estimating functional coverage in bounded model checking," in Proc. DATE. EDA Consortium, San Jose, CA, USA, 2007, pp. 1176–1181.
[6] H. Chockler, D. Kroening, and M. Purandare, "Coverage in interpolation-based model checking," in Proc. DAC. ACM, 2010, pp. 182–187.
[7] D. Lin, E. Singh, C. Barrett, and S. Mitra, "A structured approach to post-silicon validation and debug using symbolic quick error detection," in Proc. ITC. IEEE, 2015, pp. 1–10.
[8] E. Singh, D. Lin, C. Barrett, and S. Mitra, "Logic bug detection and localization using symbolic quick error detection," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, pp. 1–1, 2018.
[9] E. Singh, K. Devarajegowda, S. Simon, R. Schnieder, K. Ganesan, M. R. Fadiheh, D. Stoffel, W. Kunz, C. W. Barrett, W. Ecker, and S. Mitra, "Symbolic QED Pre-Silicon Verification for Automotive Microcontroller Cores: Industrial Case Study," in Proc. DATE. IEEE, 2019, pp. 1000–1005.
[10] F. Lonsing, K. Ganesan, M. Mann, S. S. Nuthakki, E. Singh, M. Srouji, Y. Yang, S. Mitra, and C. W. Barrett, "Unlocking the Power of Formal Hardware Verification with CoSA and Symbolic QED: Invited Paper," in Proc. ICCAD. ACM, 2019, pp. 1–8.
[11] R. B. Jones, C. H. Seger, and D. L. Dill, "Self-Consistency Checking," in Proc. FMCAD, ser. LNCS, vol. 1166. Springer, 1996, pp. 159–171.
[12] M. R. Fadiheh, J. Urdahl, S. S. Nuthakki, S. Mitra, C. Barrett, D. Stoffel, and W. Kunz, "Symbolic quick error detection using symbolic initial state for pre-silicon verification," in Proc. DATE. IEEE, 2018, pp. 55–60.
[13] R. M. Keller, "A Fundamental Theorem of Asynchronous Parallel Computation," in Parallel Processing, Proc. Sagamore Computer Conference, ser. LNCS, vol. 24. Springer, 1974, pp. 102–112.
[14] R. M. Keller, "Formal Verification of Parallel Programs," Commun. ACM, vol. 19, no. 7, pp. 371–384, 1976.
[15] B. Huang, H. Zhang, P. Subramanyan, Y. Vizel, A. Gupta, and S. Malik, "Instruction-Level Abstraction (ILA): A Uniform Specification for System-on-Chip (SoC) Verification," ACM Trans. Design Autom. Electr. Syst., vol. 24, no. 1, pp. 10:1–10:24, 2019.
[16] W. A. Hunt Jr., "Microprocessor design verification," J. Autom. Reasoning, vol. 5, no. 4, pp. 429–460, 1989.
[17] J. R. Burch and D. L. Dill, "Automatic Verification of Pipelined Microprocessor Control," in Proc. CAV, ser. LNCS, vol. 818. Springer, 1994, pp. 68–80.
[18] A. Biere, E. M. Clarke, R. Raimi, and Y. Zhu, "Verifiying Safety Properties of a Power PC Microprocessor Using Symbolic Model Checking without BDDs," in Proc. CAV, ser. LNCS, vol. 1633. Springer, 1999, pp. 60–71.
[19] T. Hong, Y. Li, S. Park, D. Mui, D. Lin, Z. A. Kaleq, N. Hakim, H. Naeimi, D. S. Gardner, and S. Mitra, "QED: Quick Error Detection tests for effective post-silicon validation," in Proc. ITC. IEEE, 2010, pp. 154–163.
[20] D. Lin, T. Hong, Y. Li, F. Fallah, D. S. Gardner, N. Hakim, and S. Mitra, "Overcoming post-silicon validation challenges through quick error detection (QED)," in Proc. DATE. EDA Consortium San Jose, CA, USA / ACM DL, 2013, pp. 320–325.
[21] D. Lin, T. Hong, Y. Li, E. S, S. Kumar, F. Fallah, N. Hakim, D. S. Gardner, and S. Mitra, "Effective Post-Silicon Validation of System-on-Chips Using Quick Error Detection," IEEE Trans. on CAD of Integrated Circuits and Systems, vol. 33, no. 10, pp. 1573–1590, 2014.
[22] N. Oh, P. P. Shirvani, and E. J. McCluskey, "Error detection by duplicated instructions in super-scalar processors," IEEE Trans. Reliability, vol. 51, no. 1, pp. 63–75, 2002.
[23] K. Devarajegowda, M. R. Fadiheh, E. Singh, C. Barrett, S. Mitra, W. Ecker, D. Stoffel, and W. Kunz, "Gap-free Processor Verification by S²QED and Property Generation," in Proc. DATE. IEEE, 2020.
[24] E. Singh, F. Lonsing, S. Chattopadhyay, M. Strange, P. Wei, X. Zhang, Y. Zhou, D. Chen, J. Cong, P. Raina, Z. Zhang, C. Barrett, and S. Mitra, "A-QED Verification of Hardware Accelerators," in Proc. DAC, to appear. ACM, 2020.
[25] M. R. Fadiheh, D. Stoffel, C. W. Barrett, S. Mitra, and W. Kunz, "Processor Hardware Security Vulnerabilities and their Detection by Unique Program Execution Checking," in Proc. DATE. IEEE, 2019, pp. 994–999.
[26] G. Barthe, P. R. D'Argenio, and T. Rezk, "Secure Information Flow by Self-Composition," in Proc. CSFW-17. IEEE, 2004, pp. 100–114.
[27] G. Barthe, J. M. Crespo, and C. Kunz, "Relational Verification Using Product Programs," in Proc. FM, ser. LNCS, vol. 6664. Springer, 2011, pp. 200–214.
[28] J. B. Almeida, M. Barbosa, G. Barthe, F. Dupressoir, and M. Emmi, "Verifying Constant-Time Implementations," in Proc. USENIX. USENIX Association, 2016, pp. 53–70.
[29] W. Yang, Y. Vizel, P. Subramanyan, A. Gupta, and S. Malik, "Lazy Self-composition for Security Verification," in Proc. CAV, ser. LNCS, vol. 10982. Springer, 2018, pp. 136–156.
[30] A. Reid, R. Chen, A. Deligiannis, D. Gilday, D. Hoyes, W. Keen, A. Pathirane, O. Shepherd, P. Vrabel, and A. Zaidi, "End-to-End Verification of Processors with ISA-Formal," in Proc. CAV, ser. LNCS, vol. 9780. Springer, 2016, pp. 42–58.
[31] U. Krautz, M. Wedler, W. Kunz, K. Weber, C. Jacobi, and M. Pflanz, "Verifying full-custom multipliers by Boolean equivalence checking and an arithmetic bit level proof," in ASP-DAC. IEEE, 2008, pp. 398–403.
[32] A. A. R. Sayed-Ahmed, D. Große, U. Kühne, M. Soeken, and R. Drechsler, "Formal verification of integer multipliers by combining Gröbner basis with logic reduction," in Proc. DATE, 2016, pp. 1048–1053.
[33] D. Ritirc, A. Biere, and M. Kauers, "Column-wise verification of multipliers using computer algebra," in Proc. FMCAD, 2017, pp. 23–30.
[34] D. Kaufmann, A. Biere, and M. Kauers, "Verifying Large Multipliers by Combining SAT and Computer Algebra," in Proc. FMCAD. IEEE, 2019, pp. 28–36.

APPENDIX A
PROOFS

Proof of Lemma 1. Assume that the antecedent of the implication holds, and let $l_O \in L_O$ be an arbitrary original memory location. If $l_O = L_{out}(i_O)$, then $s'(L_{out}(i_O)) = s''(L_D(L_{out}(i_O)))$ by Corollary 1.

Suppose, on the other hand, that $l_O \neq L_{out}(i_O)$. Let $l_D = L_D(l_O)$ be the corresponding duplicate location. By the injectivity of $L_D$, we have $l_D \neq L_D(L_{out}(i_O))$, and thus $l_D \neq L_{out}(i_D)$. We can thus conclude from (1) that $s(l_O) = s'(l_O)$ and $s(l_D) = s''(l_D)$. Finally, since $s(l_O) = s(l_D)$ by assumption, we derive $s'(l_O) = s''(l_D)$, that is, $s'(l_O) = s''(L_D(l_O))$.

Proof of Case B of Lemma 4.

Since $QEDcons(s_0)$, we have

\[ s_0(l_{orig}) = s_0(l_{dup}) \tag{15} \]

Since $Spec(s_0, i_1, s_1)$ by Definition 16, we have $s_0(l_{orig}) = s_1(l_{orig})$ and $s_0(l_{dup}) = s_1(l_{dup})$, and so it follows that

\[ s_1(l_{orig}) = s_1(l_{dup}) \tag{16} \]

Due to the requirements in Case B of Definition 16, we have

\[ s_{n+1}(l_{dup}) = s_{2n}(l_{dup}) \tag{17} \]
\[ s_1(l_{orig}) = s_{2n}(l_{orig}) \tag{18} \]
\[ s_1(l_{dup}) = s_n(l_{dup}) \tag{19} \]

Now, because we are in Case B, we know that $s_n(l_{dup}) \neq s_{n+1}(l_{dup})$, so by (17),

\[ s_n(l_{dup}) \neq s_{2n}(l_{dup}) \tag{20} \]

But (16), (18) and (19) give us:

\[ s_n(l_{dup}) = s_{2n}(l_{orig}) \tag{21} \]

Thus, by (20) and (21),

\[ s_{2n}(l_{dup}) \neq s_{2n}(l_{orig}) \tag{22} \]

and hence $\neg QEDcons(s_{2n})$.

Proof of Lemma 5.

Because $k$ is minimal, we know that $i_1 \ldots i_{k-1}$ all execute according to their specification. For $i_{k+j}$, if $j$ is odd, then it is a no-operation and therefore it changes no location values, and if $j$ is even, then it executes according to specification because it is executing in an initial state. Let $l$ be some original location whose value is incorrect after the buggy instruction $i_k = i_b$ executes (we know the buggy instruction corrupts an original location because it does not corrupt a duplicate location), with $l_D = L_D(l)$. We consider a type-B bug as in Definition 16 first and assume that $l$ is not the output location of $i_k$. Since $s_0$ is QED-consistent, $s_0(l) = s_0(l_D)$. Since instructions 1 through $k$ do not change duplicate locations, we have $s_0(l) = s_k(l_D)$. By repeated application of Lemma 1 and by definition of no-operation instructions, we can then conclude that $s_{k-1}(l) = s_{3k-2}(l_D)$. Then, because of the bug, it follows that $s_k(l) \neq s_{3k}(l_D)$. Finally, because none of the instructions after $i_k$ modify original locations, $s_k(l) = s_{3k}(l)$, so $\neg QEDcons(s_{3k})$.

The case of a type-A bug where $l$ is the output location of $i_k$ can be proved analogously.

Proof of Theorem 2.