Verifying High-Level Latency-Insensitive Designs with Formal Model Checking

Abstract

Latency-insensitive design mitigates increasing interconnect delay and enables productive component reuse in complex digital systems. This design style has been adopted in high-level design flows because untimed functional blocks connected through latency-insensitive interfaces provide a natural communication abstraction. However, latency-insensitive design with high-level languages also introduces a unique set of verification challenges that jeopardize functional correctness. In particular, bugs due to invalid consumption of inputs and deadlocks can be difficult to detect and debug with dynamic simulation methods. To tackle these two classes of bugs, we propose formal model checking methods to guarantee that a high-level latency-insensitive design is unaffected by invalid input data and is free of deadlock. We develop a well-structured verification wrapper for each property to automatically construct the corresponding formal model for checking. Our experiments demonstrate that the formal checks are effective in realistic bug scenarios from high-level designs.

Steve Dai, Alicia Klinefelter, Haoxing Ren, Rangharajan Venkatesan, Ben Keller, Nathaniel Pinckney, Brucek Khailany

NVIDIA

I. INTRODUCTION

As modern SoC design challenges continue to motivate reuse of existing design blocks, latency-insensitive (LI) design has emerged as a practical methodology for synchronizing pre-assembled modules under increasing pressure of lengthening interconnect delay [1], [2]. By exposing a valid-ready interface from each module, LI design decouples the timing of intra-module computation from that of inter-module communication to ensure robust functionality while tolerating arbitrary communication latency between modules. This methodology enables flexible physical design implementation without impacting verification of individual components.

In parallel with this trend, hardware designers have embraced high-level languages for high-productivity VLSI design. In particular, high-level synthesis (HLS) compilers can automatically synthesize RTL from C++ models. Because an HLS compiler translates untimed functional blocks in software into interconnected cycle-accurate hardware modules with customized throughput and latency, it is natural for HLS to adopt an LI-based composition of the modules to take advantage of the modularity and relaxed timing requirement of LI design. The confluence of HLS and LI design has enabled rapid design of large-scale chips using high-level languages [3], [4].

Figure 1 shows a typical HLS design flow (on the left) for generating an RTL model from a C++ description. Designers generally rely on dynamic simulation of C++ and RTL models for verifying their designs. The HLS design flow allows designers to leverage C++ simulation for the bulk of their design verification tasks and promises orders of magnitude speedup over a conventional RTL flow [5]. Nevertheless, due to inherent differences in timing models between C++ and RTL, C++ simulation must be complemented by RTL simulation to expose bugs that require cycle-accurate introspection.
[Figure 1 diagram. Typical flow: C++ Source, C++ Simulation, Verified C++, HLS Compilation, HLS-generated RTL, RTL Simulation, Verified RTL. Our formal verification extensions: Automated Wrapper, Wrapped RTL, Formal Model Checking, Invalid Consumption and/or Deadlock?]

Fig. 1. Typical HLS design and verification flow with our proposed formal extensions. Design is typically verified with dynamic simulations in C++ and RTL. We extend verification with automated formal model checking.

The quality of dynamic simulation depends heavily on the effectiveness of the set of chosen stimuli. These stimuli realistically represent only a small subspace of all possible inputs and risk excluding difficult-to-anticipate corner cases. In addition, the LI property allows for arbitrary timing of inter-module interface, further requiring expansion of the verification coverage space to include a full range of input arrival times and relative input ordering. With dynamic simulation, it is especially difficult to detect and eliminate design issues resulting in invalid consumption of inputs and deadlock, which constitute two commonly occurring bugs in LI-based HLS design. To address these two classes of bugs, we propose augmenting a typical HLS design flow with formal verification extensions (right side of Figure 1 in bold). The major contributions of this work are:

• We propose RTL formal model checking methods that guarantee an LI-based HLS design is not affected by invalid input data and is free of deadlock for all intended use cases.

• We develop an automated flow to generate the verification wrapper for each property and implement the corresponding formal checks without human intervention.

• We demonstrate the effectiveness of the proposed verification models and flow in realistic industry and academic bug scenarios, as well as on a range of design blocks.

The remainder of the paper is organized as follows: Section II reviews related work; Section III establishes the basics for LI design in HLS; Section IV describes the two classes of common bugs we target and proposes two formal models to detect these bugs; Section V demonstrates the effectiveness of our models, followed by conclusions in Section VI.

II. RELATED WORK

LI design refers to the correct-by-construction composition of stallable computational processes that exchange data through communication channels in accordance with an appropriate LI protocol such as valid/ready or request/acknowledge [1], [6]. As long as each computational process (i.e., functional module) is implemented correctly, the composition will also behave correctly regardless of the latency of the channels. The introduction of the methodology has led to the emergence of a family of LI protocols, followed by a set of dynamic simulation and formal verification techniques to validate the correctness of these protocols [7], [8]. Our work is concerned with verifying designs implemented using LI methodology, rather than validating the correctness of the methodology and protocols as in these previous works.

Designers typically verify LI-based HLS designs with dynamic simulation in C++ and RTL as they iterate on various design changes. Conventionally, formal verification methods have been applied to prove the correctness of C-to-RTL transformations by checking the equivalence of the model before and after transformation, rather than validating the correctness of the C++ implementation itself [9], [10]. KAIROS leverages formal equivalence checking to verify whether a modified HLS design is equivalent to the original (golden) design after incremental code modification or change in HLS optimizations [11]. Unlike previous formal techniques for HLS, our methods do not target bugs caused by the transformations of the HLS tool. Instead, we target designer-induced bugs while not requiring a golden model as ground truth.

III. LI IMPLEMENTATION IN HLS

In this work, LI-based HLS designs are realized using MatchLib, a high-level library of synthesizable port and channel primitives in C++ implementing LI valid-ready protocols for HLS tools [12]. For an LI design in HLS, each functional block exposes an interface of directional ports from the MatchLib library, shown as InA, InB, and Out in Figure 2(a). These ports are then connected to MatchLib ports on other functional blocks via MatchLib channels. As shown in Figure 2(a), each port can read (pop) data from a channel (e.g., InA and InB) or write (push) data to a channel (e.g., Out).

(a)

    void adder_blocking() {
      while (1) {
        DataA = InA.Pop();
        DataB = InB.Pop();
        Out.Push(DataA+DataB);
      }}

(b)

    void adder_nonblocking() {
      statusA = false; statusB = false;
      while (1) {
        if (!statusA)
          statusA = InA.PopNB(DataA);
        if (!statusB)
          statusB = InB.PopNB(DataB);
        if (statusA && statusB) {
          Out.Push(DataA+DataB);
          statusA = false; statusB = false;
      }}}

Fig. 2. Simple adder design in C++. (a) Input ports are blocking. (b) Input ports are non-blocking. Line numbers cited in the text count from the function signature of each listing (Line 1).

    input  [31:0] InA_msg;
    input         InA_val;
    output        InA_rdy;
    output [31:0] Out_msg;
    output        Out_val;
    input         Out_rdy;

Fig. 3. Interface of simple adder design in Verilog after HLS. Valid/ready input port signals and output port signals, respectively.

Transactions acting on a MatchLib port can be either blocking or non-blocking. Blocking communication prevents subsequent transactions from executing until the current read or write has succeeded, and can block forward progress. In contrast, non-blocking communication allows subsequent transactions to execute regardless of whether the current read or write is successful, and cannot block forward progress. Figure 2(b) implements the same adder design as in Figure 2(a), except with non-blocking input ports InA and InB. In (b), statusA and statusB indicate whether DataA and DataB respectively contain valid input data from the channels. Because of the use of a blocking pop in (a), Line 4 cannot be executed until the Pop in Line 3 successfully reads valid data from port InA into local variable DataA. However, with non-blocking pop in (b), Line 6 can be executed after Line 5 regardless of whether the PopNB in Line 5 is successful.
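As a rough illustration of why the bookkeeping in Figure 2(b) matters, the following self-contained C++ sketch models a non-blocking input port with a plain queue and steps the adder loop one iteration at a time. ToyInPort, ToyAdder, and run_skewed_arrivals are hypothetical names introduced only for this sketch; they are not the MatchLib API, and the model ignores clock cycles and output backpressure.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical stand-in for a latency-insensitive input port (NOT the MatchLib API).
struct ToyInPort {
    std::queue<int> fifo;  // data currently waiting in the channel

    // Non-blocking pop: returns true and writes `out` only when data is available.
    bool PopNB(int &out) {
        if (fifo.empty()) return false;  // nothing valid to read; `out` is untouched
        out = fifo.front();
        fifo.pop();
        return true;
    }
};

// One object per adder; step() is one iteration of the while(1) loop in Figure 2(b).
struct ToyAdder {
    ToyInPort InA, InB;
    std::vector<int> Out;  // values pushed to the output port, in order
    int DataA = 0, DataB = 0;
    bool statusA = false, statusB = false;

    void step() {
        if (!statusA) statusA = InA.PopNB(DataA);  // bookkeeping: keep A once valid
        if (!statusB) statusB = InB.PopNB(DataB);  // bookkeeping: keep B once valid
        if (statusA && statusB) {                  // guard: add only valid operands
            Out.push_back(DataA + DataB);
            statusA = false;
            statusB = false;
        }
    }
};

// Drives the adder with operands that arrive in different iterations.
std::vector<int> run_skewed_arrivals() {
    ToyAdder adder;
    adder.InA.fifo.push(1);   // A arrives first
    adder.step();             // only A is valid, so no output
    adder.step();             // still waiting for B
    adder.InB.fifo.push(2);   // B arrives two iterations later
    adder.step();             // both valid: pushes 1 + 2
    adder.InB.fifo.push(10);  // next pair: B arrives before A
    adder.step();             // only B is valid, so no output
    adder.InA.fifo.push(20);
    adder.step();             // both valid: pushes 20 + 10
    return adder.Out;
}
```

Calling run_skewed_arrivals() returns the two sums 3 and 30 regardless of which operand arrives first, because each status flag holds its operand until the other one becomes valid.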

Figure 3 shows the corresponding valid-ready interfaces in RTL for input port InA and output port Out, generated from the C++ descriptions in Figure 2. Blocking and non-blocking ports share the same RTL interface but differ in their internal blocking logic.

It is important for an LI implementation to include non-blocking communication so that realistic and scalable designs can be expressed. Figure 4 illustrates a three-input one-output process p. If p implements a multiplier, all three inputs must be available before the output is valid; however, if p is an arbiter, only one input needs to be available before producing a valid output. The multiplier could be implemented with blocking communication, while the arbiter must include non-blocking reads to be functional.

While the decision to use blocking versus non-blocking communication is dependent on internal process details, each comes with its own pitfalls that result in the two classes of bugs targeted by this paper. Blocking communication creates an inherent wait-for relationship between design blocks and can lead to deadlock. Non-blocking communication requires custom bookkeeping logic (e.g., Lines 4, 6, and 8 in Figure 2(b)) to prevent undesirable consumption of invalid input data, which can increase the chance of designer error. Mixing blocking and non-blocking communication further complicates the design. Our work provides safeguards from these kinds of unintended and undesirable consequences through automation, without extra burden on the HLS designer or verification engineer.

IV. FORMAL MODEL CHECKING FOR LI DESIGN

Fig. 4. Dataflow for a three-input one-output process. Internal details of process p, in addition to the dataflow model, determine whether input ports should be blocking or non-blocking.

The first class of bugs we target involves consumption of invalid inputs at the LI interface. When non-blocking communication is needed, a user typically writes custom bookkeeping logic to manage the communication. This is a common cause of errors, since improperly constrained non-blocking reads and writes can result in tainted updates to stateful elements by invalid input data. The risk of error increases with additional non-blocking ports as designers attempt to manage the complex interaction among multiple instances of custom bookkeeping logic while keeping track of how the sequential C++ design entry will translate into parallel hardware in generated RTL. This class of bugs is difficult to detect with dynamic simulation because the designer-imposed constraints are often buggy only under limited corner cases that are non-trivial to conceive ahead of time during test planning.

The second class of bugs we target involves deadlock, which may arise due to a multitude of factors, including incorrect capacities for communication channels, improper application or combination of blocking and non-blocking ports, latency-sensitive bookkeeping logic, or circular dependencies among different blocks. It is difficult for designers to be completely mindful of all sources of deadlock during the design process, especially when the design entry is untimed and sequential C++. Exposing potential deadlock scenarios with dynamic simulation requires stressing the range of input arrival times (and therefore relative input orderings) at the LI interfaces within the design. Unfortunately, finding buggy combinations of signal arrivals can require many iterations to expose bugs requiring complex input patterns.

Formal model checking [13] is commonly applied to more thoroughly verify various hardware components and protocols [14], [15]. Therefore, it is uniquely positioned to address the verification gap by proving the absence of the two classes of bugs in our designs without limitation to a specific subset of stimulus and input arrival timing. In particular, we apply RTL-based formal model checking on the HLS-generated RTL design blocks to verify properties associated with these two classes of bugs. Figures 5 and 6 present an overview of the two corresponding formal models.

[Figure 5 diagram: a reference DUT and a test DUT with input signals D_InA^Ref, D_InB^Ref, D_InA^Test, D_InB^Test, Val_InA, Val_InB and output signals Val_Out^Ref, D_Out^Ref, Val_Out^Test, D_Out^Test, with equality comparators.]

Fig. 5. Formal model for verifying invalid input insensitive property. Verifies if valid outputs are equivalent even under different invalid inputs.

A. Invalid Input Consumption Check

Our invalid input consumption check proves whether valid output data of the design are unaffected by invalid input data. In other words, invalid input data must not exert influence on any valid output data. Recall the non-blocking adder in Figure 2(b) in which input data are read from non-blocking ports InA and InB. If Line 9 in Figure 2(b) is not guarded by the conditional statement in Line 8, the adder will perform addition regardless of whether DataA and DataB contain valid input data. In this case, valid output at Out is affected by invalid inputs, and fails the property of being unaffected by invalid input. While this hypothetical bug represents a relatively contrived case of incorrect bookkeeping logic, there are many examples of improperly constrained non-blocking operations that can be discovered by this check.

Figure 5 shows how we wrap a design under test (DUT) into a formal model that checks for invalid input consumption. This model consists of a reference as well as a test instance of the same DUT, with corresponding reference and test inputs and outputs as shown. On the input side, both DUT instances are set up to always receive the same valid inputs, but may receive different invalid inputs, as modeled by the multiplexers. On the output side, the model verifies that any valid outputs are equivalent by comparing the corresponding output valid and output data signals. $Val^{Ref} == Val^{Test}$ and $Val^{Ref} \,\&\, Val^{Test} \implies D^{Ref} == D^{Test}$ define the conditions that constrain the inputs and check the outputs. Note that the model is set up such that both the reference and test DUTs receive the same external ready signal.
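The two-instance construction can be sketched in software as follows. This is a minimal sketch under stated assumptions, not the paper's SystemVerilog wrapper: AdderModel is a hypothetical cycle-level abstraction of the adder in Figure 2(b), output backpressure is omitted, and instead of symbolic data the loop enumerates every pattern of input valid bits over a small number of cycles while feeding fixed, deliberately different garbage values to the two instances whenever an input is invalid.

```cpp
#include <cassert>

struct Input  { bool val; int data; };
struct Output { bool val; int data; };

// Hypothetical cycle-level abstraction of the non-blocking adder in Figure 2(b).
// guarded == false models the bug where the guard on Line 8 is missing.
struct AdderModel {
    bool guarded;
    bool statusA, statusB;
    int dataA, dataB;

    explicit AdderModel(bool g)
        : guarded(g), statusA(false), statusB(false), dataA(0), dataB(0) {}

    Output step(Input a, Input b) {
        // The data wires always carry some value; only the valid bit says
        // whether that value is meaningful.
        if (!statusA) { dataA = a.data; statusA = a.val; }
        if (!statusB) { dataB = b.data; statusB = b.val; }
        if (!guarded || (statusA && statusB)) {
            Output out = {true, dataA + dataB};
            statusA = false;
            statusB = false;
            return out;
        }
        Output idle = {false, 0};
        return idle;
    }
};

// Returns true if, for every valid pattern over `cycles` cycles, the reference
// and test instances agree on output valid and on every valid output value.
bool invalid_input_insensitive(bool guarded, int cycles) {
    const int patterns = 1 << (2 * cycles);  // two valid bits per cycle
    for (int p = 0; p < patterns; ++p) {
        AdderModel ref(guarded), test(guarded);
        for (int c = 0; c < cycles; ++c) {
            const bool valA = ((p >> (2 * c)) & 1) != 0;
            const bool valB = ((p >> (2 * c + 1)) & 1) != 0;
            const int a = 1 + c, b = 100 + c;  // shared valid data
            // Same data when valid, different garbage when invalid (the "mux").
            Input refA  = {valA, valA ? a : -7};
            Input testA = {valA, valA ? a : 42};
            Input refB  = {valB, valB ? b : -9};
            Input testB = {valB, valB ? b : 77};
            Output r = ref.step(refA, refB);
            Output t = test.step(testA, testB);
            if (r.val != t.val) return false;                       // Val_Ref == Val_Test
            if (r.val && t.val && r.data != t.data) return false;   // valid => equal data
        }
    }
    return true;
}
```

With the guard in place, invalid_input_insensitive(true, 4) holds, while invalid_input_insensitive(false, 4) fails in the very first cycle because the unguarded adder emits a valid sum computed from garbage operands. A formal tool would treat the data as fully symbolic rather than using the handful of concrete values chosen here.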

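[Figure 6 diagram: Producer and Consumer modules connected by Data and Ack channels. Each module has a staller (en_n) producing a global stall signal, GlobalStall_Producer and GlobalStall_Consumer, which feed the "Deadlock?" check.]

Fig. 6. Formal model for verifying deadlock-free property. Verifies whether all modules are in the stalled state simultaneously.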

B. Deadlock Detection

Our deadlock detection proves whether a design is free of deadlock. Figure 7 shows a simple system with two interacting functional blocks that contains a potential deadlock. If cond in the Consumer function initializes to false, consumer never accepts the data from producer, but instead immediately tries to push an acknowledgment to producer. However, producer would stall during push because its data are not accepted and cannot move on to popping the acknowledgment from consumer. As a result, both producer and consumer end up in a stalled state. When all interacting modules are stalled, no module can trigger forward progress, and the system is in deadlock. This represents one of many scenarios for which an LI design could deadlock.

To detect a deadlock under any of these scenarios, we can make use of global stall signals that a typical HLS tool generates for individual RTL modules. For instance, Mentor Catapult HLS assigns a wait controller for each stallable interface (e.g., blocking port) of a module. These wait controllers then communicate with the staller of the module to form the global stall signal for clock gating the module [16]. Figure 6 provides an abstract illustration of the producer and consumer modules in RTL generated from the C++ descriptions in Figure 7. As shown in Figure 6, our verification wrapper constructs the formal model by aggregating the global stall signals from individual modules. During the formal check, our model ensures that any input or output to the system is not blocked. The system contains a deadlock if the formal check determines that it is possible for all aggregated global stall signals to be asserted at the same time. If not, the system is free of deadlock. The deadlock-free property for an N-module design is formally expressed as $\neg \left( \bigwedge_{i=1}^{N} GlobalStall_i \right)$.

We extend our deadlock detection method to support non-blocking ports, which do not include explicit stall signals because they cannot be blocked by definition. In this case, we devise a custom global stall signal for each RTL module from the ready signals of its non-blocking ports.
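As a rough illustration of the blocking-port case described above, where a deadlock is flagged when every module's stall signal is asserted in the same cycle, the following self-contained C++ sketch replays the producer/consumer system of Figure 7. All names (Op, Module, handshake, all_stalled, can_deadlock) are hypothetical and introduced only for this sketch. Channels are modeled as zero-capacity rendezvous, which is an assumption rather than a description of the HLS-generated hardware, and the code follows a single execution instead of exploring all behaviors as a formal tool would.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

enum class Chan { Data, Ack };
enum class Dir { Push, Pop };
struct Op { Chan chan; Dir dir; };

// A module is the list of blocking operations in its while(1) loop body.
struct Module {
    std::vector<Op> loop;
    std::size_t pc;  // index of the operation the module is waiting on

    explicit Module(std::vector<Op> ops) : loop(std::move(ops)), pc(0) {}
    const Op &current() const { return loop[pc]; }
    void advance() { pc = (pc + 1) % loop.size(); }
};

// A transfer happens when one side pushes and the other pops the same channel.
bool handshake(const Op &x, const Op &y) {
    return x.chan == y.chan && x.dir != y.dir;
}

// Conjunction of all global stall signals: the negation of the
// deadlock-free property for the modeled modules.
bool all_stalled(const std::vector<bool> &global_stall) {
    for (bool stalled : global_stall)
        if (!stalled) return false;
    return true;
}

// Replays Figure 7 for up to max_cycles and reports whether a cycle is
// reached in which both modules are stalled simultaneously.
bool can_deadlock(bool cond, int max_cycles) {
    std::vector<Op> producer_ops;
    producer_ops.push_back({Chan::Data, Dir::Push});  // Data.Push(1)
    producer_ops.push_back({Chan::Ack, Dir::Pop});    // Ack.Pop()

    std::vector<Op> consumer_ops;
    if (cond) consumer_ops.push_back({Chan::Data, Dir::Pop});  // Data.Pop()
    consumer_ops.push_back({Chan::Ack, Dir::Push});            // Ack.Push(1)

    Module producer(producer_ops);
    Module consumer(consumer_ops);

    for (int cycle = 0; cycle < max_cycles; ++cycle) {
        const bool transfer = handshake(producer.current(), consumer.current());
        std::vector<bool> global_stall;
        global_stall.push_back(!transfer);  // producer's stall signal
        global_stall.push_back(!transfer);  // consumer's stall signal
        if (all_stalled(global_stall)) return true;  // no one can make progress
        producer.advance();
        consumer.advance();
    }
    return false;
}
```

With cond false, can_deadlock(false, 8) reports a deadlock in the first cycle: the producer waits to push Data while the consumer waits to push Ack. With cond true, the two loops keep handshaking and can_deadlock(true, 8) returns false.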
This custom global stall signal is asserted if none of the relevant ready signals have been asserted for N clock cycles, where N is a known constant at verification time based on the HLS-applied optimizations. This type of custom global stall signal can be used in lieu of the global stall signal in Figure 6 to implement the same deadlock check.

    void Producer() {
      while (1) {
        Data.Push(1);
        Ack.Pop();
      }}

    void Consumer() {
      while (1) {
        if (cond) {
          Data.Pop();
        }
        Ack.Push(1);
      }}

Fig. 7. Simple system in C++ with two interacting functional blocks. Both blocks will stall if Consumer does not accept data from Producer.

TABLE I. RESULTS OF FORMAL MODEL CHECKING ON REALISTIC ABSTRACTED BUG CASES. RUNTIMES ARE REPORTED IN SECONDS.

    Design                      Model          Version   Gates   Runtime   Result
    Unconstrained Input         Invalid Input  Initial   0.1k    5.99      Falsified
                                               1st fix   0.1k    6.94      Proven
    Under-constrained Read      Invalid Input  Initial   1.9k    14.14     Falsified
                                               1st fix   1.9k    18.24     Falsified
                                               2nd fix   1.9k    359.3     Proven
    Out-of-order Push           Deadlock       Initial   0.8k    12.87     Falsified
                                               1st fix   0.8k    12.71     Falsified
                                               2nd fix   0.8k    32.08     Proven
    Circular Dependency         Deadlock       Initial   1.2k    13.51     Falsified
                                               1st fix   1.2k    17.68     Proven
    Mismatched Pipeline Depths  Deadlock       Initial   0.7k    8.24      Falsified
                                               1st fix   0.7k    18.47     Proven

V. EXPERIMENTS

Our formal models are implemented using SystemVerilog assertions and verified with bounded model checking [17] using Synopsys VC Formal 2018.10 running on an Intel Xeon CPU at 3 GHz. Formal model checking is performed on the RTL synthesized from C++ using an HLS tool. Because HLS is an automated process that compiles C++ into predictably-structured RTL well-suited for the extraction of relevant signals outlined in Sections IV-A and IV-B, our formal flow can be fully automated. Although the bugs we target originate in C++ during design entry, we formally verify the designs in RTL because the full scopes of the bugs only manifest under the cycle-accurate timing model of RTL.

We first validate our formal models using known bug cases, listed in Table I, abstracted from real industry and academic HLS designs. We abstract the design names to indicate the primary cause of the bugs. Each of these bug cases consists of the initial (buggy) version of the corresponding design followed by one or more patched (but possibly still buggy) versions of the same design. The initial versions of all the designs are written without knowledge of our formal models. Likewise, the models are developed without specific knowledge of the initial designs. As such, our abstracted bug cases provide a minimally viable but faithful reproduction of the specific bugs to help narrow down the root causes of the bugs and to understand and validate the results of our formal models.

Table I details the findings of the formal engine against our proposed models for different versions of each design, along with post-logic-synthesis gate count and runtime of the formal engine. The evolution of each design from the initial version to the final fix demonstrates the effectiveness and correctness of our formal models. As shown in the table, our formal models have extracted bugs even from purportedly fixed and verified designs, which speaks to the shortcoming of the existing verification methodology. The counterexamples provided by these proofs are instrumental in quickly identifying the root cause and devising the appropriate fix.

To further demonstrate the applicability of our approach, we apply our formal models on a set of open-source HLS library components from MatchLib [12]. These library components are meant to be reused across a large number of designs, and therefore constitute good candidates for extensive verification.

TABLE II. RESULTS OF FORMAL MODEL CHECKING ON HLS BENCHMARKS. RUNTIMES ARE REPORTED IN SECONDS. BOUNDS ARE SHOWN UNDER RESULTS/BOUND FOR NON-EXHAUSTIVE PROOFS.

    Design               Model      Gates    Runtime   Results/Bound
    Components
      [...]
    Applications
      NoCRouterArray.v1  Deadlock   32.0k    12.71     Falsified
      NoCRouterArray.v2  Deadlock   32.0k    16915     450
      OpticalFlow.v1 ⋆   Deadlock   321k +   [...]     [...]
      [...] †            Deadlock   321k +   [...]     [...]
      [...] ⋆            Deadlock   140k +   [...]     [...]
      [...] ⋄            Deadlock   140k +   [...]     [...]

⋆ Original benchmark. † All HLS FIFO depths=1024. ⋄ Maximum HLS FIFO depth=4. + Gate count excludes 9kB-269kB of SRAMs.

We also experiment with full applications of our own and from the open-source HLS benchmark suite Rosetta [18] to demonstrate the general applicability of our models. As shown in Table II, we apply the invalid input check on the five design components and successfully prove that they are unaffected by invalid inputs. On the other hand, we apply deadlock detection on a 2x2 array of network-on-chip (NoC) router components, an optical flow accelerator, and a machine-learning accelerator for spam filtering, where we identify certain deadlock states.

Results in Table II show that we are able to prove exhaustively that the five components are unaffected by invalid input, and prove at the user-defined bound that specific versions of the applications are free of deadlock. Specifically, our deadlock check supports the NoCRouterArray application, which makes use of non-blocking ports exclusively. We discover an incorrect protocol constraint in NoCRouterArray.v1 that results in a deadlock and apply the appropriate fix in NoCRouterArray.v2 after examining the counterexample trace. We also detect a deadlock in OpticalFlow.v2 that escapes the test cases in the supplied test bench. Compared to OpticalFlow.v1, OpticalFlow.v2 is buggy because it contains one FIFO with a reduced capacity, resulting in a potential cyclic wait scenario from insufficient FIFO capacity due to the re-convergence pattern in the benchmark's dataflow [19]. SpamFilter.v2 does not incur the same problem even with reduced FIFO sizes compared to SpamFilter.v1 because the benchmark's dataflow pattern contains no branches and thus no re-convergence.

VI. CONCLUSIONS

While the LI design methodology simplifies the composition of synthesized functional blocks in HLS, this latency-tolerant design style also introduces additional difficulty in ensuring the correctness of high-level designs. In this paper, we provide an automated flow based on formal model checking to guarantee that a high-level LI design is not affected by invalid input data and is free of deadlock. The proposed verification techniques are effective and generally applicable to a range of LI-based HLS designs, and lead to promising improvement in the quality of verification. We believe that closing the verification gap is key to mainstream adoption of HLS tools, and our formal verification extensions play a crucial role as part of static sign-off toward this direction [20]. We expect wider adoption of our approach as formal tools and their underlying engines become increasingly scalable [21], [22].

REFERENCES

[1] L. P. Carloni, K. L. McMillan, A. Saldanha, and A. L. Sangiovanni-Vincentelli, “A Methodology for Correct-by-construction Latency Insensitive Design,” Int’l Conf. on Computer-Aided Design (ICCAD), 2003.
[2] L. P. Carloni, “From Latency-Insensitive Design to Communication-based System-Level Design,” Proceedings of the IEEE, 2015.
[3] R. Venkatesan, Y. S. Shao, B. Zimmer, J. Clemons, M. Fojtik, N. Jiang, B. Keller, A. Klinefelter, N. Pinckney, P. Raina et al., “A 0.11 PJ/OP, 0.32-128 TOPS, Scalable Multi-Chip-Module-Based Deep Neural Network Accelerator Designed with A High-Productivity VLSI Methodology,” Symp. on High Performance Chips (Hot Chips), 2019.
[4] T. Ajayi, K. Al-Hawaj, A. Amarnath, S. Dai, S. Davidson, P. Gao, G. Liu, A. Lotfi, J. Puscar, A. Rao et al., “Celerity: An Open Source RISC-V Tiered Accelerator Fabric,” Symp. on High Performance Chips (Hot Chips), 2017.
[5] J. Cong, B. Liu, S. Neuendorffer, J. Noguera, K. Vissers, and Z. Zhang, “High-Level Synthesis for FPGAs: From Prototyping to Deployment,” IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2011.
[6] L. P. Carloni, K. L. McMillan, and A. L. Sangiovanni-Vincentelli, “Theory of Latency-Insensitive Design,” IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2001.
[7] S. Suhaib, D. Mathaikutty, D. Berner, and S. Shukla, “Validating Families of Latency Insensitive Protocols,” IEEE Transactions on Computers (TC), 2006.
[8] C.-H. Li, R. Collins, S. Sonalkar, and L. P. Carloni, “Design, Implementation, and Validation of a New Class of Interface Circuits for Latency-Insensitive Design,” Int’l Conf. on Formal Methods and Models for Codesign (MEMOCODE), 2007.
[9] C. Karfa, D. Sarkar, C. Mandal, and P. Kumar, “An Equivalence-Checking Method for Scheduling Verification in High-Level Synthesis,” IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2008.
[10] Y. Kim, S. Kopuri, and N. Mansouri, “Automated Formal Verification of Scheduling Process using Finite State Machines with Datapath (FSMD),” Int’l Symp. on Signals, Circuits and Systems (SCS), 2004.
[11] L. Piccolboni, G. Di Guglielmo, and L. P. Carloni, “KAIROS: Incremental Verification in High-Level Synthesis through Latency-Insensitive Design,” Formal Methods in Computer Aided Design (FMCAD), 2019.
[12] B. Khailany, E. Krimer, R. Venkatesan, J. Clemons, J. S. Emer, M. Fojtik, A. Klinefelter, M. Pellauer, N. Pinckney, Y. S. Shao et al., “A Modular Digital VLSI Flow for High-Productivity SoC Design,” Design Automation Conf. (DAC), 2018.
[13] E. M. Clarke Jr, O. Grumberg, D. Kroening, D. Peled, and H. Veith, Model Checking. MIT Press, 2018.
[14] B. Bingham, M. Greenstreet, and J. Bingham, “Parameterized Verification of Deadlock Freedom in Symmetric Cache Coherence Protocols,” Formal Methods in Computer-Aided Design (FMCAD), 2011.
[15] D. Kaufmann, A. Biere, and M. Kauers, “Verifying Large Multipliers by Combining SAT and Computer Algebra,” Formal Methods in Computer Aided Design (FMCAD), 2019.
[16] Catapult Synthesis User and Reference Manual, Software Version v10.5a, Mentor Graphics, 2020.
[17] A. Biere, A. Cimatti, E. M. Clarke, O. Strichman, Y. Zhu et al., “Bounded Model Checking,” Advances in Computers, 2003.
[18] Y. Zhou, U. Gupta, S. Dai, R. Zhao, N. Srivastava, H. Jin, J. Featherston, Y.-H. Lai, G. Liu, G. A. Velasquez et al., “Rosetta: A Realistic High-Level Synthesis Benchmark Suite for Software Programmable FPGAs,” Int’l Symp. on Field-Programmable Gate Arrays (FPGA), 2018.
[19] M. Fingeroff, High-Level Synthesis Blue Book. Xlibris Corporation, 2010.
[20] P. Ashar and V. Viswanath, “Closing the Verification Gap with Static Sign-off,” Int’l Symp. on Quality Electronic Design (ISQED), 2019.
[21] M. Mann and C. Barrett, “Partial Order Reduction for Deep Bug Finding in Synchronous Hardware,” Int’l Conf. on Tools and Algorithms for the Construction and Analysis of Systems (TACAS), 2020.
[22] C. Barrett, C. L. Conway, M. Deters, L. Hadarean, D. Jovanović, T. King, A. Reynolds, and C. Tinelli, “CVC4,”