Verifying High-Level Latency-Insensitive Designs with Formal Model Checking
Steve Dai, Alicia Klinefelter, Haoxing Ren, Rangharajan Venkatesan, Ben Keller, Nathaniel Pinckney, Brucek Khailany
NVIDIA
Abstract: Latency-insensitive design mitigates increasing interconnect delay and enables productive component reuse in complex digital systems. This design style has been adopted in high-level design flows because untimed functional blocks connected through latency-insensitive interfaces provide a natural communication abstraction. However, latency-insensitive design with high-level languages also introduces a unique set of verification challenges that jeopardize functional correctness. In particular, bugs due to invalid consumption of inputs and deadlocks can be difficult to detect and debug with dynamic simulation methods. To tackle these two classes of bugs, we propose formal model checking methods to guarantee that a high-level latency-insensitive design is unaffected by invalid input data and is free of deadlock. We develop a well-structured verification wrapper for each property to automatically construct the corresponding formal model for checking. Our experiments demonstrate that the formal checks are effective in realistic bug scenarios from high-level designs.
I. INTRODUCTION
As modern SoC design challenges continue to motivate reuse of existing design blocks, latency-insensitive (LI) design has emerged as a practical methodology for synchronizing pre-assembled modules under increasing pressure of lengthening interconnect delay [1], [2]. By exposing a valid-ready interface from each module, LI design decouples the timing of intra-module computation from that of inter-module communication to ensure robust functionality while tolerating arbitrary communication latency between modules. This methodology enables flexible physical design implementation without impacting verification of individual components.

In parallel with this trend, hardware designers have embraced high-level languages for high-productivity VLSI design. In particular, high-level synthesis (HLS) compilers can automatically synthesize RTL from C++ models. Because an HLS compiler translates untimed functional blocks in software into interconnected cycle-accurate hardware modules with customized throughput and latency, it is natural for HLS to adopt an LI-based composition of the modules to take advantage of the modularity and relaxed timing requirement of LI design. The confluence of HLS and LI design has enabled rapid design of large-scale chips using high-level languages [3], [4].

Figure 1 shows a typical HLS design flow (on the left) for generating an RTL model from a C++ description. Designers generally rely on dynamic simulation of C++ and RTL models for verifying their designs. The HLS design flow allows designers to leverage C++ simulation for the bulk of their design verification tasks and promises orders of magnitude speedup over a conventional RTL flow [5]. Nevertheless, due to inherent differences in timing models between C++ and RTL, C++ simulation must be complemented by RTL simulation to expose bugs that require cycle-accurate introspection.
[Figure 1: flow diagram. Left: C++ Source, C++ Simulation, Verified C++, HLS Compilation, HLS-generated RTL, RTL Simulation, Verified RTL. Right (our formal verification extensions): Automated Wrapper, Wrapped RTL, Formal Model Checking, Invalid Consumption and/or Deadlock?]
Fig. 1. Typical HLS design and verification flow with our proposed formal extensions. The design is typically verified with dynamic simulations in C++ and RTL. We extend verification with automated formal model checking.
The quality of dynamic simulation depends heavily on the effectiveness of the set of chosen stimuli. These stimuli realistically represent only a small subspace of all possible inputs and risk excluding difficult-to-anticipate corner cases. In addition, the LI property allows for arbitrary timing of inter-module interface, further requiring expansion of the verification coverage space to include a full range of input arrival times and relative input ordering. With dynamic simulation, it is especially difficult to detect and eliminate design issues resulting in invalid consumption of inputs and deadlock, which constitute two commonly occurring bugs in LI-based HLS design. To address these two classes of bugs, we propose augmenting a typical HLS design flow with formal verification extensions (right side of Figure 1 in bold). The major contributions of this work are:
• We propose RTL formal model checking methods that guarantee an LI-based HLS design is not affected by invalid input data and is free of deadlock for all intended use cases.
• We develop an automated flow to generate the verification wrapper for each property and implement the corresponding formal checks without human intervention.
• We demonstrate the effectiveness of the proposed verification models and flow in realistic industry and academic bug scenarios, as well as on a range of design blocks.

The remainder of the paper is organized as follows: Section II reviews related work; Section III establishes the basics for LI design in HLS; Section IV describes the two classes of common bugs we target and proposes two formal models to detect these bugs; Section V demonstrates the effectiveness of our models, followed by conclusions in Section VI.

II. RELATED WORK
LI design refers to the correct-by-construction composition of stallable computational processes that exchange data through communication channels in accordance with an appropriate LI protocol such as valid/ready or request/acknowledge [1], [6]. As long as each computational process (i.e., functional module) is implemented correctly, the composition will also behave correctly regardless of the latency of the channels. The introduction of the methodology has led to the emergence of a family of LI protocols, followed by a set of dynamic simulation and formal verification techniques to validate the correctness of these protocols [7], [8]. Our work is concerned with verifying designs implemented using LI methodology, rather than validating the correctness of the methodology and protocols as in these previous works.

Designers typically verify LI-based HLS designs with dynamic simulation in C++ and RTL as they iterate on various design changes. Conventionally, formal verification methods have been applied to prove the correctness of C-to-RTL transformations by checking the equivalence of the model before and after transformation, rather than validating the correctness of the C++ implementation itself [9], [10]. KAIROS leverages formal equivalence checking to verify whether a modified HLS design is equivalent to the original (golden) design after incremental code modification or change in HLS optimizations [11]. Unlike previous formal techniques for HLS, our methods do not target bugs caused by the transformations of the HLS tool. Instead, we target designer-induced bugs while not requiring a golden model as ground truth.

III. LI IMPLEMENTATION IN HLS

In this work, LI-based HLS designs are realized using MatchLib, a high-level library of synthesizable port and channel primitives in C++ implementing LI valid-ready protocols for HLS tools [12]. For an LI design in HLS, each functional block exposes an interface of directional ports from the MatchLib library, shown as InA, InB, and Out in Figure 2(a). These ports are then connected to MatchLib ports on other functional blocks via MatchLib channels. As shown in Figure 2(a), each port can read (pop) data from a channel (e.g., InA and InB) or write (push) data to a channel (e.g., Out).

(a)
    void adder_blocking() {
      while (1) {
        DataA = InA.Pop();
        DataB = InB.Pop();
        Out.Push(DataA+DataB);
      }
    }

(b)
    void adder_nonblocking() {
      statusA = false; statusB = false;
      while (1) {
        if (!statusA)
          statusA = InA.PopNB(DataA);
        if (!statusB)
          statusB = InB.PopNB(DataB);
        if (statusA && statusB) {
          Out.Push(DataA+DataB);
          statusA = false; statusB = false;
        }
      }
    }

Fig. 2. Simple adder design in C++. (a) Input ports are blocking. (b) Input ports are non-blocking.

Transactions acting on a MatchLib port can be either blocking or non-blocking. Blocking communication prevents subsequent transactions from executing until the current read or write has succeeded, and can block forward progress. In contrast, non-blocking communication allows subsequent transactions to execute regardless of whether the current read or write is successful, and cannot block forward progress. Figure 2(b) implements the same adder design as in Figure 2(a), except with non-blocking input ports InA and InB. statusA and statusB in (b) are used to indicate whether DataA and DataB respectively contain valid input data from the channels. Because of the use of a blocking pop in (a), Line 4 cannot be executed until the Pop in Line 3 successfully reads valid data from port InA into local variable DataA. However, with non-blocking pop in (b), Line 6 can be executed after Line 5 regardless of whether the PopNB in Line 5 is successful.

    input  [31:0] InA_msg;
    input         InA_val;
    output        InA_rdy;

    output [31:0] Out_msg;
    output        Out_val;
    input         Out_rdy;

Fig. 3. Interface of simple adder design in Verilog after HLS. Valid/ready input port signals and output port signals, respectively.
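To make the bookkeeping of Figure 2(b) concrete, the following is a minimal software sketch, not the MatchLib implementation. ToyChannel, NonBlockingAdder, iterate, and run_demo are hypothetical names introduced only for illustration, the channel is assumed to hold a single entry, and the output is written with a non-blocking push that is retried on the next iteration (the figure uses a blocking Push).

```cpp
#include <cassert>
#include <vector>

// Hypothetical one-entry channel for illustration only; NOT the MatchLib API.
template <typename T>
class ToyChannel {
 public:
  // Non-blocking write: returns false if the single slot is occupied.
  bool PushNB(const T& v) {
    if (full_) return false;
    value_ = v;
    full_ = true;
    return true;
  }
  // Non-blocking read: returns false if empty and leaves 'out' untouched.
  bool PopNB(T& out) {
    if (!full_) return false;
    out = value_;
    full_ = false;
    return true;
  }

 private:
  T value_{};
  bool full_ = false;
};

// Software model of the loop body in Figure 2(b).
struct NonBlockingAdder {
  ToyChannel<int>& InA;
  ToyChannel<int>& InB;
  ToyChannel<int>& Out;
  bool statusA = false, statusB = false;
  int DataA = 0, DataB = 0;

  NonBlockingAdder(ToyChannel<int>& a, ToyChannel<int>& b, ToyChannel<int>& o)
      : InA(a), InB(b), Out(o) {}

  // One pass through the while(1) body.
  void iterate() {
    if (!statusA) statusA = InA.PopNB(DataA);
    if (!statusB) statusB = InB.PopNB(DataB);
    // Assumption: retry the push next iteration if the output slot is full.
    if (statusA && statusB && Out.PushNB(DataA + DataB)) {
      statusA = false;
      statusB = false;
    }
  }
};

// Inputs arrive two iterations apart; the sum appears only once both are held.
std::vector<int> run_demo() {
  ToyChannel<int> a, b, out;
  NonBlockingAdder adder(a, b, out);
  std::vector<int> results;
  for (int cycle = 0; cycle < 4; ++cycle) {
    if (cycle == 0) a.PushNB(3);
    if (cycle == 2) b.PushNB(4);
    adder.iterate();
    int sum = 0;
    if (out.PopNB(sum)) results.push_back(sum);
  }
  return results;
}
```

Calling run_demo() yields a single result, 7, produced in the iteration where the second operand arrives. The status flags are what keep the adder from summing a stale or missing operand in the earlier iterations.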
The corresponding valid-ready interfaces in RTL for input port InA and output port Out generated from the C++ descriptions in Figure 2 are shown in Figure 3. Blocking and non-blocking ports share the same RTL interface but differ in their internal blocking logic.

It is important for an LI implementation to include non-blocking communication so that realistic and scalable designs can be expressed. Figure 4 illustrates a three-input one-output process p. If p implements a multiplier, all three inputs must be available before the output is valid; however, if p is an arbiter, only one input needs to be available before producing a valid output. The multiplier could be implemented with blocking communication, while the arbiter must include non-blocking reads to be functional.

While the decision to use blocking versus non-blocking communication is dependent on internal process details, each comes with its own pitfalls that result in the two classes of bugs targeted by this paper. Blocking communication creates an inherent wait-for relationship between design blocks and can lead to deadlock. Non-blocking communication requires custom bookkeeping logic (e.g., Lines 4, 6, and 8 in Figure 2(b)) to prevent undesirable consumption of invalid input data, and this extra logic increases the chance of designer error. Mixing blocking and non-blocking communication further complicates the design. Our work provides safeguards against these kinds of unintended and undesirable consequences through automation, without extra burden on the HLS designer or verification engineer.

IV. FORMAL MODEL CHECKING FOR LI DESIGN
The first class of bugs we target involves consumption of invalid inputs at the LI interface. When non-blocking communication is needed, a user typically writes custom bookkeeping logic to manage the communication. This is a common cause of errors, since improperly constrained non-blocking reads and writes can result in tainted updates to stateful elements by invalid input data. The risk of error increases with additional non-blocking ports as designers attempt to manage the complex interaction among multiple instances of custom bookkeeping logic while keeping track of how the sequential C++ design entry will translate into parallel hardware in generated RTL. This class of bugs is difficult to detect with dynamic simulation because the designer-imposed constraints are often buggy only under limited corner cases that are non-trivial to conceive ahead of time during test planning.

The second class of bugs we target involves deadlock, which may arise due to a multitude of factors, including incorrect capacities for communication channels, improper application or combination of blocking and non-blocking ports, latency-sensitive bookkeeping logic, or circular dependencies among different blocks. It is difficult for designers to be completely mindful of all sources of deadlock during the design process, especially when the design entry is untimed and sequential C++. Exposing potential deadlock scenarios with dynamic simulation requires stressing the range of input arrival times (and therefore relative input orderings) at the LI interfaces within the design. Unfortunately, bugs that depend on complex input patterns can require many simulation iterations before the buggy combination of signal arrivals is hit.

Formal model checking [13] is commonly applied to more thoroughly verify various hardware components and protocols [14], [15]. Therefore, it is uniquely positioned to address the verification gap by proving the absence of the two classes of bugs in our designs without limitation to a specific subset of stimulus and input arrival timing. In particular, we apply RTL-based formal model checking on the HLS-generated RTL design blocks to verify properties associated with these two classes of bugs. Figures 5 and 6 present an overview of the two corresponding formal models.

[Figure 4: two panels, (a) and (b), each showing a process p with inputs I and output O.]
Fig. 4. Dataflow for a three-input one-output process. Internal details of process p, in addition to the dataflow model, determine whether input ports should be blocking or non-blocking.

[Figure 5: block diagram with a reference DUT and a test DUT. Signals: D_InARef, D_InBRef, D_InATest, D_InBTest, Val_InA, Val_InB, Val_OutRef, D_OutRef, Val_OutTest, D_OutTest, with equality comparators (==) on the outputs.]
Fig. 5. Formal model for verifying invalid input insensitive property. Verifies if valid outputs are equivalent even under different invalid inputs.
A. Invalid Input Consumption Check
Our invalid input consumption check proves whether valid output data of the design are unaffected by invalid input data. In other words, invalid input data must not exert influence on any valid output data. Recall the non-blocking adder in Figure 2(b) in which input data are read from non-blocking ports InA and InB. If Line 9 in Figure 2(b) is not guarded by the conditional statement in Line 8, the adder will perform addition regardless of whether DataA and DataB contain valid input data. In this case, valid output at Out is affected by invalid inputs, and the design fails the property of being unaffected by invalid input. While this hypothetical bug represents a relatively contrived case of incorrect bookkeeping logic, there are many examples of improperly constrained non-blocking operations that can be discovered by this check.

Figure 5 shows how we wrap a design under test (DUT) into a formal model that checks for invalid input consumption. This model consists of a reference as well as a test instance of the same DUT, with corresponding reference and test inputs and outputs as shown. On the input side, both DUT instances are set up to always receive the same valid inputs, but may receive different invalid inputs, as modeled by the multiplexers. On the output side, the model verifies that any valid outputs are equivalent by comparing the corresponding output valid and output data signals. $Val_{Ref} == Val_{Test}$ and $Val_{Ref} \,\&\, Val_{Test} \Rightarrow D_{Ref} == D_{Test}$ define the conditions that constrain the inputs and check the outputs. Note that the model is set up such that both the reference and test DUTs receive the same external ready signal.

[Figure 6: Producer and Consumer modules connected by Data and Ack channels, each with a Staller driving en_n. The GlobalStall_Producer and GlobalStall_Consumer signals are aggregated to answer "Deadlock?".]
Fig. 6. Formal model for verifying deadlock-free property. Verifies whether all modules are in the stalled state simultaneously.
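The reference/test construction of Section IV-A can be illustrated with a small explicit-state sketch in C++. This is not the SystemVerilog wrapper used in this work. It assumes a toy model of the Figure 2(b) adder with 2-bit data, an output that is always ready, and a PopNB that copies whatever is on the data wires and returns the valid bit. The names Adder, step, pack, and invalid_input_insensitive are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>
#include <set>

using u8 = std::uint8_t;

// Toy state of the non-blocking adder (2-bit data).
struct Adder {
  bool statusA = false, statusB = false;
  u8 dataA = 0, dataB = 0;
};
struct Out { bool val; u8 data; };

// One loop iteration. guarded=false models dropping the Line 8 conditional.
Out step(Adder& s, bool valA, u8 dA, bool valB, u8 dB, bool guarded) {
  if (!s.statusA) { s.dataA = dA; s.statusA = valA; }
  if (!s.statusB) { s.dataB = dB; s.statusB = valB; }
  Out o{false, 0};
  if (!guarded || (s.statusA && s.statusB)) {
    o = Out{true, static_cast<u8>((s.dataA + s.dataB) & 3)};
    s.statusA = false;
    s.statusB = false;
  }
  return o;
}

// Encode one adder state in 6 bits for the visited set.
unsigned pack(const Adder& s) {
  return (s.statusA ? 1u : 0u) | (s.statusB ? 2u : 0u) |
         (static_cast<unsigned>(s.dataA) << 2) |
         (static_cast<unsigned>(s.dataB) << 4);
}

struct Pair { Adder ref, test; };

// Explores every reachable (reference, test) state pair under the input
// constraint and returns false as soon as an output check fails.
bool invalid_input_insensitive(bool guarded) {
  std::set<unsigned> seen;
  std::queue<Pair> work;
  work.push(Pair{});
  seen.insert(0u);
  while (!work.empty()) {
    Pair cur = work.front();
    work.pop();
    for (int v = 0; v < 4; ++v) {        // valid bits, shared by both DUTs
      bool valA = (v & 1) != 0, valB = (v & 2) != 0;
      for (int r = 0; r < 16; ++r) {     // reference data (A, B)
        for (int t = 0; t < 16; ++t) {   // test data (A, B)
          u8 aR = r & 3, bR = r >> 2, aT = t & 3, bT = t >> 2;
          // Input constraint: valid data match; invalid data are free.
          if ((valA && aR != aT) || (valB && bR != bT)) continue;
          Pair nxt = cur;
          Out oR = step(nxt.ref, valA, aR, valB, bR, guarded);
          Out oT = step(nxt.test, valA, aT, valB, bT, guarded);
          // Output check: same valid, and same data whenever valid.
          if (oR.val != oT.val) return false;
          if (oR.val && oT.val && oR.data != oT.data) return false;
          unsigned key = pack(nxt.ref) | (pack(nxt.test) << 6);
          if (seen.insert(key).second) work.push(nxt);
        }
      }
    }
  }
  return true;
}
```

Under these assumptions, invalid_input_insensitive(true) holds for the guarded adder, while invalid_input_insensitive(false) fails in the first step because two different invalid values on InA already produce different valid sums. A production flow would express the same constraint and check as assertions for a formal tool rather than enumerate states by hand.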
B. Deadlock Detection
Our deadlock detection proves whether a design is free of deadlock. Figure 7 shows a simple system with two interacting functional blocks that contains a potential deadlock. If cond in the Consumer function initializes to false, the consumer never accepts the data from the producer, but instead immediately tries to push an acknowledgment to the producer. However, the producer stalls during its push because its data are not accepted, and it cannot move on to popping the acknowledgment from the consumer. As a result, both producer and consumer end up in a stalled state. When all interacting modules are stalled, no module can trigger forward progress, and the system is in deadlock. This represents one of many scenarios in which an LI design could deadlock.

To detect a deadlock under any of these scenarios, we can make use of global stall signals that a typical HLS tool generates for individual RTL modules. For instance, Mentor Catapult HLS assigns a wait controller for each stallable interface (e.g., blocking port) of a module. These wait controllers then communicate with the staller of the module to form the global stall signal for clock gating the module [16]. Figure 6 provides an abstract illustration of the producer and consumer modules in RTL generated from the C++ descriptions in Figure 7. As shown in Figure 6, our verification wrapper constructs the formal model by aggregating the global stall signals from individual modules. During the formal check, our model ensures that any input or output to the system is not blocked. The system contains a deadlock if the formal check determines that it is possible for all aggregated global stall signals to be asserted at the same time. If not, the system is free of deadlock. The deadlock-free property for an N-module design is formally expressed as $\neg\left(\bigwedge_{i=1}^{N} GlobalStall_i\right)$.

We extend our deadlock detection method to support non-blocking ports, which do not include explicit stall signals because they cannot be blocked by definition. In this case, we devise a custom global stall signal for each RTL module from the ready signals of its non-blocking ports.
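For the blocking case, the deadlock-free property stated above can be sketched on the Producer/Consumer system of Figure 7. The C++ below is a toy abstraction, not the RTL-level check: it assumes that cond stays constant, that a push completes only in a step where the peer is at the matching pop, and that a module is stalled whenever its current operation cannot complete. The names POp, COp, step, and deadlock_free are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <set>
#include <utility>

// Operation each module is currently trying to complete.
enum class POp { PushData, PopAck };   // Producer
enum class COp { PopData, PushAck };   // Consumer

struct State { POp p; COp c; };
struct StepResult { State next; bool stallProducer; bool stallConsumer; };

// One step of the two-module system under the assumed rendezvous semantics.
StepResult step(State s, bool cond) {
  bool dataXfer = (s.p == POp::PushData) && (s.c == COp::PopData);
  bool ackXfer = (s.p == POp::PopAck) && (s.c == COp::PushAck);
  State n = s;
  if (dataXfer) {
    n.p = POp::PopAck;
    n.c = COp::PushAck;
  } else if (ackXfer) {
    n.p = POp::PushData;
    n.c = cond ? COp::PopData : COp::PushAck;
  }
  bool moved = dataXfer || ackXfer;
  // In this abstraction both modules progress or stall together.
  return StepResult{n, !moved, !moved};
}

// Checks not(GlobalStall_Producer and GlobalStall_Consumer) on every
// reachable state; returns false if some state has all modules stalled.
bool deadlock_free(bool cond) {
  State s{POp::PushData, cond ? COp::PopData : COp::PushAck};
  std::set<std::pair<int, int>> seen;
  while (seen.insert({static_cast<int>(s.p), static_cast<int>(s.c)}).second) {
    StepResult r = step(s, cond);
    std::array<bool, 2> globalStall{{r.stallProducer, r.stallConsumer}};
    bool allStalled = true;
    for (bool g : globalStall) allStalled = allStalled && g;
    if (allStalled) return false;  // property violated: deadlock state
    s = r.next;
  }
  return true;  // revisited a known state without finding a deadlock
}
```

With cond true the system cycles between a data transfer and an acknowledgment transfer, so deadlock_free(true) holds. With cond false the initial state already has the producer waiting to push data and the consumer waiting to push an acknowledgment, so deadlock_free(false) reports the deadlock described in the text.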
This custom global stall signal is asserted if none of the relevant ready signals have been asserted for N clock cycles, where N is a known constant at verification time based on the HLS-applied optimizations. This type of custom global stall signal can be used in lieu of the global stall signal in Figure 6 to implement the same deadlock check.

    void Producer() {
      while (1) {
        Data.Push(1);
        Ack.Pop();
      }
    }

    void Consumer() {
      while (1) {
        if (cond) {
          Data.Pop();
        }
        Ack.Push(1);
      }
    }

Fig. 7. Simple system in C++ with two interacting functional blocks. Both blocks will stall if Consumer does not accept data from Producer.

TABLE I
RESULTS OF FORMAL MODEL CHECKING ON REALISTIC ABSTRACTED BUG CASES. RUNTIMES ARE REPORTED IN SECONDS.

Design                      Model         Version   Gates   Runtime   Result
Unconstrained Input         InvalidInput  Initial   0.1k    5.99      Falsified
                                          1st fix   0.1k    6.94      Proven
Under-constrained Read      InvalidInput  Initial   1.9k    14.14     Falsified
                                          1st fix   1.9k    18.24     Falsified
                                          2nd fix   1.9k    359.3     Proven
Out-of-order Push           Deadlock      Initial   0.8k    12.87     Falsified
                                          1st fix   0.8k    12.71     Falsified
                                          2nd fix   0.8k    32.08     Proven
Circular Dependency         Deadlock      Initial   1.2k    13.51     Falsified
                                          1st fix   1.2k    17.68     Proven
Mismatched Pipeline Depths  Deadlock      Initial   0.7k    8.24      Falsified
                                          1st fix   0.7k    18.47     Proven

V. EXPERIMENTS
Our formal models are implemented using SystemVerilog assertions and verified with bounded model checking [17] using Synopsys VC Formal 2018.10 running on an Intel Xeon CPU at 3GHz. Formal model checking is performed on the RTL synthesized from C++ using an HLS tool. Because HLS is an automated process that compiles C++ into predictably-structured RTL well-suited for the extraction of relevant signals outlined in Sections IV-A and IV-B, our formal flow can be fully automated. Although the bugs we target originate in C++ during design entry, we formally verify the designs in RTL because the full scopes of the bugs only manifest under the cycle-accurate timing model of RTL.

We first validate our formal models using known bug cases, listed in Table I, abstracted from real industry and academic HLS designs. We abstract the design names to indicate the primary cause of the bugs. Each of these bug cases consists of the initial (buggy) version of the corresponding design followed by one or more patched (but possibly still buggy) versions of the same design. The initial versions of all the designs are written without knowledge of our formal models. Likewise, the models are developed without specific knowledge of the initial designs. As such, our abstracted bug cases provide a minimally viable but faithful reproduction of the specific bugs to help narrow down the root causes of the bugs and to understand and validate the results of our formal models.

Table I details the findings of the formal engine against our proposed models for different versions of each design, along with post-logic-synthesis gate count and runtime of the formal engine. The evolution of each design from the initial version to the final fix demonstrates the effectiveness and correctness of our formal models. As shown in the table, our formal models have extracted bugs even from purportedly fixed and verified designs, which speaks to the shortcomings of the existing verification methodology. The counterexamples provided by these proofs are instrumental in quickly identifying the root cause and devising the appropriate fix.

To further demonstrate the applicability of our approach, we apply our formal models on a set of open-source HLS library components from MatchLib [12]. These library components are meant to be reused across a large number of designs, and therefore constitute good candidates for extensive verification.
TABLE II
RESULTS OF FORMAL MODEL CHECKING ON HLS BENCHMARKS. RUNTIMES ARE REPORTED IN SECONDS. BOUNDS ARE SHOWN UNDER RESULT/BOUND FOR NON-EXHAUSTIVE PROOFS.

Group         Design              Model     Gates    Runtime    Result/Bound
Components    [rows for the five components are missing from this copy]
Applications  NoCRouterArray.v1   Deadlock  32.0k    12.71      Falsified
              NoCRouterArray.v2   Deadlock  32.0k    16915      450
              OpticalFlow.v1 ⋆    Deadlock  321k +   [missing]  [missing]
              OpticalFlow.v2 †    Deadlock  321k +   [missing]  [missing]
              SpamFilter.v1 ⋆     Deadlock  140k +   [missing]  [missing]
              SpamFilter.v2 ⋄     Deadlock  140k +   [missing]  [missing]

⋆ Original benchmark. † All HLS FIFO depths=1024. ⋄ Maximum HLS FIFO depth=4. + Gate count excludes 9kB-269kB of SRAMs.
We also experiment with full applications of our own and from the open-source HLS benchmark suite Rosetta [18] to demonstrate the general applicability of our models. As shown in Table II, we apply the invalid input check on the five design components and successfully prove that they are unaffected by invalid inputs. On the other hand, we apply deadlock detection on a 2x2 array of network-on-chip (NoC) router components, an optical flow accelerator, and a machine-learning accelerator for spam filtering, where we identify certain deadlock states.

Results in Table II show that we are able to prove exhaustively that the five components are unaffected by invalid input, and prove at the user-defined bound that specific versions of the applications are free of deadlock. Specifically, our deadlock check supports the NoCRouterArray application, which makes use of non-blocking ports exclusively. We discover an incorrect protocol constraint in NoCRouterArray.v1 that results in a deadlock and apply the appropriate fix in NoCRouterArray.v2 after examining the counterexample trace. We also detect a deadlock in OpticalFlow.v2 that escapes the test cases in the supplied test bench. Compared to OpticalFlow.v1, OpticalFlow.v2 is buggy because it contains one FIFO with a reduced capacity, resulting in a potential cyclic wait scenario from insufficient FIFO capacity due to the re-convergence pattern in the benchmark's dataflow [19]. SpamFilter.v2 does not incur the same problem even with reduced FIFO sizes compared to SpamFilter.v1 because the benchmark's dataflow pattern contains no branches and thus no re-convergence.

VI. CONCLUSIONS
While the LI design methodology simplifies the composition of synthesized functional blocks in HLS, this latency-tolerant design style also introduces additional difficulty in ensuring the correctness of high-level designs. In this paper, we provide an automated flow based on formal model checking to guarantee that a high-level LI design is not affected by invalid input data and is free of deadlock. The proposed verification techniques are effective and generally applicable to a range of LI-based HLS designs, and lead to promising improvement in the quality of verification. We believe that closing the verification gap is key to mainstream adoption of HLS tools, and our formal verification extensions play a crucial role as part of static sign-off toward this direction [20]. We expect wider adoption of our approach as formal tools and their underlying engines become increasingly scalable [21], [22].
REFERENCES
[1] L. P. Carloni, K. L. McMillan, A. Saldanha, and A. L. Sangiovanni-Vincentelli, “A Methodology for Correct-by-construction Latency Insensitive Design,” Int’l Conf. on Computer-Aided Design (ICCAD), 2003.
[2] L. P. Carloni, “From Latency-Insensitive Design to Communication-based System-Level Design,” Proceedings of the IEEE, 2015.
[3] R. Venkatesan, Y. S. Shao, B. Zimmer, J. Clemons, M. Fojtik, N. Jiang, B. Keller, A. Klinefelter, N. Pinckney, P. Raina et al., “A 0.11 PJ/OP, 0.32-128 TOPS, Scalable Multi-Chip-Module-Based Deep Neural Network Accelerator Designed with A High-Productivity VLSI Methodology,” Symp. on High Performance Chips (Hot Chips), 2019.
[4] T. Ajayi, K. Al-Hawaj, A. Amarnath, S. Dai, S. Davidson, P. Gao, G. Liu, A. Lotfi, J. Puscar, A. Rao et al., “Celerity: An Open Source RISC-V Tiered Accelerator Fabric,” Symp. on High Performance Chips (Hot Chips), 2017.
[5] J. Cong, B. Liu, S. Neuendorffer, J. Noguera, K. Vissers, and Z. Zhang, “High-Level Synthesis for FPGAs: From Prototyping to Deployment,” IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2011.
[6] L. P. Carloni, K. L. McMillan, and A. L. Sangiovanni-Vincentelli, “Theory of Latency-Insensitive Design,” IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2001.
[7] S. Suhaib, D. Mathaikutty, D. Berner, and S. Shukla, “Validating Families of Latency Insensitive Protocols,” IEEE Transactions on Computers (TC), 2006.
[8] C.-H. Li, R. Collins, S. Sonalkar, and L. P. Carloni, “Design, Implementation, and Validation of a New Class of Interface Circuits for Latency-Insensitive Design,” Int’l Conf. on Formal Methods and Models for Codesign (MEMOCODE), 2007.
[9] C. Karfa, D. Sarkar, C. Mandal, and P. Kumar, “An Equivalence-Checking Method for Scheduling Verification in High-Level Synthesis,” IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2008.
[10] Y. Kim, S. Kopuri, and N. Mansouri, “Automated Formal Verification of Scheduling Process using Finite State Machines with Datapath (FSMD),” Int’l Symp. on Signals, Circuits and Systems (SCS), 2004.
[11] L. Piccolboni, G. Di Guglielmo, and L. P. Carloni, “KAIROS: Incremental Verification in High-Level Synthesis through Latency-Insensitive Design,” Formal Methods in Computer Aided Design (FMCAD), 2019.
[12] B. Khailany, E. Krimer, R. Venkatesan, J. Clemons, J. S. Emer, M. Fojtik, A. Klinefelter, M. Pellauer, N. Pinckney, Y. S. Shao et al., “A Modular Digital VLSI Flow for High-Productivity SoC Design,” Design Automation Conf. (DAC), 2018.
[13] E. M. Clarke Jr, O. Grumberg, D. Kroening, D. Peled, and H. Veith, Model Checking. MIT Press, 2018.
[14] B. Bingham, M. Greenstreet, and J. Bingham, “Parameterized Verification of Deadlock Freedom in Symmetric Cache Coherence Protocols,” Formal Methods in Computer-Aided Design (FMCAD), 2011.
[15] D. Kaufmann, A. Biere, and M. Kauers, “Verifying Large Multipliers by Combining SAT and Computer Algebra,” Formal Methods in Computer Aided Design (FMCAD), 2019.
[16] Catapult Synthesis User and Reference Manual, Software Version v10.5a, Mentor Graphics, 2020.
[17] A. Biere, A. Cimatti, E. M. Clarke, O. Strichman, Y. Zhu et al., “Bounded Model Checking,” Advances in Computers, 2003.
[18] Y. Zhou, U. Gupta, S. Dai, R. Zhao, N. Srivastava, H. Jin, J. Featherston, Y.-H. Lai, G. Liu, G. A. Velasquez et al., “Rosetta: A Realistic High-Level Synthesis Benchmark Suite for Software Programmable FPGAs,” Int’l Symp. on Field-Programmable Gate Arrays (FPGA), 2018.
[19] M. Fingeroff, High-Level Synthesis Blue Book. Xlibris Corporation, 2010.
[20] P. Ashar and V. Viswanath, “Closing the Verification Gap with Static Sign-off,” Int’l Symp. on Quality Electronic Design (ISQED), 2019.
[21] M. Mann and C. Barrett, “Partial Order Reduction for Deep Bug Finding in Synchronous Hardware,” Int’l Conf. on Tools and Algorithms for the Construction and Analysis of Systems (TACAS), 2020.
[22] C. Barrett, C. L. Conway, M. Deters, L. Hadarean, D. Jovanović, T. King, A. Reynolds, and C. Tinelli, “CVC4,”