Automatic Verification of LLVM Code
Axel Legay (UCLouvain, Belgium), Dirk Nowotka (Kiel University, Germany), and Danny Bøgsted Poulsen (Aalborg University, Denmark)

June 5, 2020
Abstract
In this work we present Lodin, a software verification tool for LLVM code that incorporates explicit-state model checking, statistical model checking and symbolic-state model checking algorithms.
Formal Methods, in particular Model Checking [1], have for many years promised to revolutionise the way we assert software correctness. They have gained a large following in the hardware design industry, but have yet to become mainstream in the software development industry, despite software being used in a large array of safety-critical components in e.g. cars and airplanes. Nowadays, any non-trivial component of any system is controlled by an embedded microprocessor with a control program, making software quality assurance more important than ever. Many case studies have shown that formal methods are a valuable tool, even in industrial contexts, but most successful applications have been conducted by academic researchers exploring the usefulness of formal methods.

One of the reasons that formal methods have not penetrated the software industry is that they require a translation of the source code into a formal model (e.g. Petri nets or automata), with the analysis conducted on these formal models. This is problematic as it requires industry engineers to invest quite some effort into understanding the formal modelling language and its associated tool. The diagnostic output of formal tools is also hard to understand without being an expert in formal methods. As a result, industry quality assurance relies on extensive testing, which will have to be done even after applying formal methods, and code reviews. Another complicating factor in applying the above-mentioned workflow is that sometimes the engineers do not know the source code intimately: parts of it might have been auto-generated and some of it might be legacy code. Attempting to translate code one has not developed into a formal model is very difficult and error-prone.

In summary, the learning curve of formal methods is steep, so industry engineers rely on other methods, and translating code to formal models is very hard and close to impossible.
Formal tools are needed that understand the source code industry already uses, to ease the adoption of formal tools in industry. Academics have developed tools accepting pure code as input [2, 5, 13, 14]. A major breakthrough was achieved by tools such as Blast [5] and SLAM [2], based around Counter-Example-Guided Abstraction Refinement (CEGAR) [9], where a program text is explored symbolically based on a predicate abstraction of the program. The predicates are continuously refined to make the abstraction as detailed as needed. Another approach, pioneered by the tool CBMC [16], is bounded model checking [6]. Here the program transition system is unrolled a number of times (in practice by unrolling loops and inlining function calls) and encoded into a constraint system. During encoding, assertions can be added that have to hold along any execution (e.g. that a divisor is never zero). If the resulting constraint system has a solution in which an assertion is violated, then the system is not safe. CEGAR and bounded model checking are incomplete, but both are nevertheless very successful in locating errors.

Nowadays the most successful software verification tools are CBMC [16] (a bounded model checker) and CPAChecker [4] (a CEGAR-based tool and direct successor of Blast). These tools are among the dominating tools in software verification competitions (https://sv-comp.sosy-lab.org). CBMC and CPAChecker are both tied to one source language, so major parts of the tools have to be reimplemented for each language they want to support. A better idea may be to base the analyses on an intermediate format that can capture the semantics of many high-level languages. One such intermediate format is LLVM [17], which at least four tools are using:

1. LLBMC [13] follows in the footsteps of CBMC and performs bounded model checking on LLVM,
2. SeaHorn [15] has the objective of making a verification platform for LLVM code; it seems to employ mostly CEGAR-based approaches,
3. Klee [7] is a symbolic execution engine performing a symbolic exploration of the state space in order to find good test cases for testing, and
4. Divine [3] is an explicit-state model checker for LLVM code.

Although the previously mentioned tools have paved the way for formal methods entering industry, they are not without flaws. A lot of them primarily focus on single-threaded programs, which is a problem because industry is moving to multi-core architectures and verification thus needs to take interleaving into account. This interleaving is the cause of the state space explosion problem, a problem that the symbolic representations of LLBMC, CBMC and CPAChecker cannot avoid. Although there has been some work in adapting at least CBMC to concurrent code, it is still an open problem how to verify concurrent programs efficiently.

In this paper we present the tool Lodin, a fairly new tool [18] offering a range of verification techniques for LLVM. For concurrent programs it implements explicit-state reachability. Realising that an exhaustive state space search will not scale for large programs, it also implements under-approximate state space searches through simulation. For single-threaded programs Lodin implements symbolic exploration akin to CBMC and LLBMC. In this way, Lodin distinguishes itself from existing tools by implementing several techniques in a joint framework.
Lodin achieves its ability to implement different techniques through its flexible architecture. Another feature of Lodin that sets it apart from other formal tools is its extensibility through platform plugins: the core of Lodin implements only the bare minimum semantics of LLVM and has no knowledge of the runtime environment of the program. In real-life programs, the executing program may call into the runtime environment, which Lodin must know about in order to provide correct verification results. The platform plugins serve as a way to provide these implementations.
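A plugin mechanism of this kind can be pictured as a registry mapping external function names to host-side implementations, with unknown externals falling back to a nondeterministic result. The following is a minimal sketch under our own naming; it is illustrative and not Lodin's actual API.

```python
# Illustrative sketch of a platform-plugin registry (names are ours,
# not Lodin's actual API). A plugin supplies host-side implementations
# for external functions; unknown externals fall back to a
# nondeterministic set of possible results.

class PlatformPlugin:
    """Maps external function names to Python callables."""
    def __init__(self, name, functions):
        self.name = name
        self.functions = functions  # dict: function name -> callable

class Interpreter:
    def __init__(self, plugins):
        self.plugins = plugins

    def call_external(self, fname, args, nondet_choices):
        # Try each plugin in order; the first one defining fname wins.
        for plugin in self.plugins:
            if fname in plugin.functions:
                return plugin.functions[fname](*args)
        # No plugin knows the function: model it nondeterministically.
        return nondet_choices

# A toy "libc" plugin implementing abs().
libc = PlatformPlugin("libc", {"@abs": lambda x: x if x >= 0 else -x})
interp = Interpreter([libc])

print(interp.call_external("@abs", [-3], None))          # known: 3
print(interp.call_external("@getchar", [], range(256)))  # unknown: all results
```

The fallback corresponds to replacing an unmodelled external call by a nondeterministic value, as discussed later for the lodin nd instruction.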
Although the focus of this paper is not to describe the LLVM [17] language itself, we spend some time on presenting a simplified version of the LLVM instruction set and its semantics. The full LLVM language description is available online [12]. The description we provide is closely linked to the implementation inside Lodin.

    ; Function Attrs: nounwind uwtable
    define void @main () {
    init:
      br label %blk
    blk:
      %x = phi i32 [ %z, %blk ], [ 0, %init ]
      %z = phi i32 [ %x, %blk ], [ 1, %init ]
      %b = icmp eq i32 %x, %z
      br i1 %b, label %succ, label %blk
    succ:
      %y = add i32 0, 1
      ret i32 1
    }

LLVM-Listing 1: An example LLVM module with a single entry point @main.

An LLVM module consists of functions, of which some may be entry point functions which are starting points for an LLVM process. Functions are divided into Basic Blocks, where a Basic Block is a sequence of instructions executed in a linear fashion. Basic blocks are named by labels, so that instructions can direct control to the basic block. Individual instructions within a basic block can be pure arithmetic operations, memory allocations, memory accesses, function calls or instructions that pass control to other basic blocks. Basic blocks are always terminated by the latter class, and these are thus called terminator instructions. Operands to the instructions of an LLVM program are kept in so-called registers, and a syntactical requirement for an LLVM program is that it must be in single-static-assignment form, i.e. each register is only assigned once.

LLVM-Listing 1 shows a very short LLVM program. The program consists of a single function @main (which is also the entry point) that consists of three basic blocks: init, blk and succ. The blocks cover lines 4−5, 7−10 and 12−13 respectively. The terminating instruction links init to block blk, and links blk to succ and blk. We refer to Figure 1 for a graphical depiction of how the basic blocks are linked together.

LLVM Types
All operations in LLVM are typed, either with an arbitrary-width bitvector, a compound datatype or a memory pointer.

[Figure 1: Control Flow Graph of LLVM-Listing 1.]

The bitvector type is denoted i n where n is the width. For our discussion, we restrict ourselves to bitvectors whose widths are multiples of bytes, thus we let T_int = { i n | n a multiple of 8 } be the set of all integer types in LLVM. If ty_1, ..., ty_n are LLVM types then ⟨ty_1, ..., ty_n⟩ is a compound type (like C-style structs). We denote by T_comp all compound LLVM types. For a type ⟨ty_1, ..., ty_n⟩ and a sequence of integers i_1, ..., i_k we let

    T_{i_1,...,i_k}(⟨ty_1, ..., ty_n⟩) = T_{i_2,...,i_k}(ty_{i_1})
    T_ε(ty) = ty,

i.e. T selects the type reached by successively indexing into a compound type. A memory pointer type to a type ty is denoted ty*. LLVM leaves the bitwidth of pointer types unspecified; for the remainder of this paper we assume it is 64 bits. As is customary in C-style languages, LLVM includes the void type used to signify that a function does not return a value.

It will often be convenient to talk about the byte-size of a type:

    BSize(ty) = n/8                          if ty = i n
    BSize(ty) = Σ_{i=1}^{n} BSize(ty_i)      if ty = ⟨ty_1, ..., ty_n⟩
    BSize(ty) = 8                            if ty = ty'*

We let T denote the set of all types in LLVM.

LLVM instructions
Let R be a set of registers, BL be a finite set of basic block labels and let Fs be a finite set of function names; then Table 1 displays the instruction set used in our discussion of LLVM. In the table, BInst(R) = Arith(R) ∪ Log(R) ∪ Mem(R) ∪ Cmp(R) ∪ Intrin(R) are the basic instructions, while Term(R, BL) are instructions terminating a basic block (e.g. jumps). A short description of the intended meaning of the instruction classes may be in order:

Arith(R) Instructions in this class are arithmetic instructions that take two registers (%inp1 and %inp2), perform the mathematical operation and store the result in %res. It is worth noting that since LLVM has no signed and unsigned types, it instead has signed and unsigned versions of some instructions. Prime examples of this are the remainder (rem) and the division (div) instructions. Signed and unsigned versions are distinguished by the prefixes 's' and 'u'.

Log(R) This class consists of instructions performing bitwise operations. It might be worth mentioning the bit shift operations. Shifting to the left, shl, is performed by moving the bit pattern towards the most significant bit and padding with zeros. For shifting to the right, LLVM has two operations, lshr and ashr. The lshr is similar to left shifting, with the difference that the pattern is shifted towards the least significant bit; it is called a logical shift. The ashr is on the other hand an arithmetic right shift, which preserves the sign bit of the pattern.

Mem(R) This instruction class has instructions for allocating memory, loading a value from a memory address and storing a value at a memory address. A special instruction in this class is the getelementptr instruction, indexing into a compound type stored in memory. It can be thought of as the dereferencing operator in C.

Cmp(R) This class of instructions is used for comparing the values of registers. As an example, %res = cmp ule i32 %inp1, %inp2 compares whether %inp1 is less than or equal to %inp2 while interpreting %inp1 and %inp2 as unsigned integers.

Term(R, BL) This class consists of instructions terminating a block. A terminating action can either be a jump to another block or a return from a function. For jumping there are two different versions: the unconditional version br label %block that jumps to the specified block no matter what, and the conditional br i1 %cond, label %ttblock, label %ffblock that jumps to %ttblock if the pattern in %cond corresponds to true and to %ffblock otherwise. There are also two return instructions: an instruction (ret void) that does not return a value and one that does (ret ty %res).

CInst(R, Fs) Instructions for calling other functions. The instruction for calling a function with name @func is %res = call ret @func (ty1 %p1, ..., tyn %pn). As one would expect, this passes control to the function @func, passes %p1, ..., %pn as parameters and stores the result of the function call into %res.

Phi(R, BL) The instruction class Phi(R, BL) consists of instructions selecting a value based on which basic block control flowed from. The instructions are needed because LLVM programs are in single-static-assignment form. The instructions are only allowed at the start of a basic block and must be executed simultaneously, i.e. the evaluation of one phi-instruction cannot affect the result of another in the same block.

Intrin(R) This class is a set of "extension instructions" used by Lodin. Currently it only consists of instructions that return a non-deterministic value.

    define dso_local i32 @main () {
    init:
      %1 = call i32 (...) @__VERIFIER_nondet_int()
      %2 = icmp ne i32 %1, 0
      br i1 %2, label %branch, label %end
    branch:
      %4 = add nsw i32 %1, 1
      br label %end
    end:
      %.0 = phi i32 [ %4, %branch ], [ %1, %init ]
      ret i32 %.0
    }

LLVM-Listing 2: Example program for using phi i32
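The simultaneous-evaluation requirement for phi instructions can be made concrete with a small sketch (plain Python standing in for the semantics; the function and data layout are ours, not Lodin's): the two phi instructions of block blk in LLVM-Listing 1 must both read the register values as they were on block entry, so %x and %z swap.

```python
# Illustrative sketch of simultaneous phi evaluation.
# In block blk of LLVM-Listing 1:
#   %x = phi i32 [ %z, %blk ], [ 0, %init ]
#   %z = phi i32 [ %x, %blk ], [ 1, %init ]
# Both phis read the *old* register file, so %x and %z swap.

def eval_phis(phis, regs, prev_block):
    """phis: list of (dest, {predecessor label: source}) pairs.
    A source is either a register name (str) or a constant."""
    snapshot = dict(regs)  # values on block entry, read-only
    for dest, choices in phis:
        src = choices[prev_block]
        regs[dest] = snapshot[src] if isinstance(src, str) else src
    return regs

phis = [("%x", {"%init": 0, "%blk": "%z"}),
        ("%z", {"%init": 1, "%blk": "%x"})]

regs = eval_phis(phis, {}, "%init")   # first entry, coming from init
print(regs)                           # {'%x': 0, '%z': 1}
regs = eval_phis(phis, regs, "%blk")  # looping back from blk: swap
print(regs)                           # {'%x': 1, '%z': 0}
```

Evaluating the phis one after another against the live register file would instead propagate the freshly written %x into %z, which is exactly what the simultaneity requirement forbids.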
Remark 1. All instructions in Table 1 can take constants as parameters in addition to real registers. For ease of exposition we will, however, treat constants as standard registers.
Formal Definitions of LLVM Modules  In the introduction to this section, we mentioned that LLVM programs consist of functions (of which some may be program entry points) and that functions consist of basic blocks. We are now turning towards giving proper formal definitions of these concepts.
Definition 1 (Basic Block). Let BL be a set of labels, Fs be a set of function names and R be a set of registers; then a basic block, B, is a finite sequence I_1 I_2 ... I_n of instructions where
• for all i < n, I_i ∈ BInst(R) ∪ CInst(R, Fs) ∪ Phi(R, BL),
• I_n ∈ Term(R, BL) and
• if I_i ∈ Phi(R, BL) then ∀ j < i, I_j ∈ Phi(R, BL).
We denote the set of all possible basic blocks over BL, R and Fs by BB(R, BL, Fs). As a convention, if B = I_1 I_2 ... I_n is a basic block then we write |B| = n for its length and we let B[i] = I_i.

Definition 2 (Function). A function F with n parameters over the function names Fs is a tuple (@N, R, P, BL, BBs, Bm, ret) where
• @N ∈ Fs is the function's name,
• R is a set of registers,
• P = p_1, ..., p_n, where for all i, p_i ∈ R, is a sequence of registers used as parameters,
• BL is a finite set of labels with the requirement that init ∈ BL,
• BBs ⊆ BB(R, BL, Fs) is a finite set of blocks,
• Bm : BL → BBs assigns each block label a basic block and
• ret ∈ T is the return type of the function.

Definition 3 (Program Entry Point). A program entry point is a function (@N, R, ∅, BL, BBs, Bm, void).

Definition 4 (Module). An LLVM module M is a tuple (F, E) where
• F = {F_1, ..., F_n} is a collection of functions where ∀ i, F_i = (@N_i, R_i, P_i, BL_i, BBs_i, Bm_i, ret_i), and for all k ≠ j, R_k ∩ R_j = ∅, and
• E = k_1, ..., k_m is a list of indices defining the entry functions, i.e. ∀ 1 ≤ i ≤ m, F_{k_i} is an entry point function.

For a module M = (F, E) we abuse notation slightly and allow writing F ∈ M whenever F ∈ F.

Well-typedness
To each register %r ∈ R we assign a type t ∈ T and write %r : t to denote that %r has type t. If registers %r_1, ..., %r_n all have the same type ty, we write %r_1, ..., %r_n : ty. Generalising this notation to an instruction Inst, we write Inst : ty to denote that Inst is well-typed with type ty. Figure 2 shows the type rules of LLVM instructions. For a function F = (@N, R, P, BL, BBs, Bm, retty) we write Rets(F) to get all return instructions within that function's basic blocks. Given this, we say that F is well-typed (F : retty) if for all Inst ∈ Rets(F), Inst : retty, and all other instructions are well-typed.

Arith(R):
    %res = add ty %inp1, %inp2
    %res = sub ty %inp1, %inp2
    %res = mul ty %inp1, %inp2
    %res = udiv ty %inp1, %inp2
    %res = sdiv ty %inp1, %inp2
    %res = urem ty %inp1, %inp2
    %res = srem ty %inp1, %inp2

Log(R):
    %res = shl ty %inp1, %inp2
    %res = lshr ty %inp1, %inp2
    %res = ashr ty %inp1, %inp2
    %res = and ty %inp1, %inp2
    %res = or ty %inp1, %inp2
    %res = xor ty %inp1, %inp2

Mem(R):
    %res = alloca ty
    %res = getelementptr ty, ty* %ptr, ty1 ind1, ..., tyn indn
    %res = load ty, ty* %addr
    store ty %val, ty* %addr

Cmp(R):
    %res = cmp eq ty %inp1, %inp2
    %res = cmp ne ty %inp1, %inp2
    %res = cmp uge ty %inp1, %inp2
    %res = cmp ugt ty %inp1, %inp2
    %res = cmp ule ty %inp1, %inp2
    %res = cmp ult ty %inp1, %inp2
    %res = cmp sge ty %inp1, %inp2
    %res = cmp sgt ty %inp1, %inp2
    %res = cmp sle ty %inp1, %inp2
    %res = cmp slt ty %inp1, %inp2

Term(R, BL):
    ret void
    ret ty %res
    br label %block
    br i1 %cond, label %ttblock, label %ffblock

Phi(R, BL):
    %res = phi ty [ %inp1, %lab1 ] ... [ %inpn, %labn ]

CInst(R, Fs):
    %res = call ret @func (ty1 %p1, ..., tyn %pn)

Intrin(R):
    %res = lodin nd ty

Table 1: Basic instructions over a set of registers R and basic block names BL, where %cond, %res, %inp1, ..., %inpn ∈ R, block, ttblock, ffblock, lab1, ..., labn ∈ BL, @func ∈ Fs and for all i, indi ∈ Z.

Modelling External Dependencies
A common problem in software verification is that the system we want to verify depends on external library functions (e.g. libc), or functions interacting directly with the operating system (e.g. pthread). In principle we could extend the LLVM language with implementations for all these external function calls, but it would unnecessarily inflate the semantics, and the semantics would have to be redefined for each external library and operating system.

Lodin combats this problem in two ways:
1. Lodin extends the LLVM language with the %res = lodin nd ty instruction that returns non-deterministic values, allowing a programmer to replace external function calls with %res = lodin nd ty and thereby explore all possible results of external function calls, and
2. Lodin allows programmers to extend the Lodin interpreter through platform plugins that provide implementations of external functions. Calls to external functions are syntactically indistinguishable from calls to functions defined in the LLVM module itself.
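The effect of replacing an external call by a nondeterministic value can be sketched as follows (our own toy harness, not Lodin code): a verifier that enumerates every possible result of the call and collects the states violating a check.

```python
# Illustrative sketch: model an external call as a nondeterministic
# value and explore every possible result (harness names are ours).

def explore(nondet_domain, step, check):
    """Run `step` once per possible nondeterministic result and
    collect every resulting state that violates `check`."""
    violations = []
    for v in nondet_domain:
        state = step(v)
        if not check(state):
            violations.append(state)
    return violations

# Program under test:  x = external();  y = 100 / (x - 7)
# Replacing external() by "lodin nd i8" means trying every 8-bit value.
def step(x):
    return {"x": x, "crash": (x - 7) == 0}

bad = explore(range(256), step, lambda s: not s["crash"])
print(bad)  # [{'x': 7, 'crash': True}]  -- division by zero found at x == 7
```

Exhaustively enumerating the domain is only feasible for small types; for wider types this is where the symbolic engine pays off, since it can represent the whole domain as a constraint instead.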
Lodin has been developed with reusability in mind, allowing core components to be used for both explicit-state analysis and symbolic-state analysis. The semantics we present in the following reflect this reusability by defining the core semantics in terms of a context. The context is responsible for representing the register values, for how memory is represented and for implementing operations on registers. The core semantics "just" translate the LLVM instruction set to operations on context states and keep track of the control flow. In some sense one could consider the context to be a "virtual machine".

Binary:    ty ∈ T_int    %res, %inp1, %inp2 : ty
           ⟹  (%res = inst ty %inp1, %inp2) : ty

Compare:   %res : i1    %inp1, %inp2 : ty    ty ∈ T_int
           ⟹  (%res = cmp cc ty %inp1, %inp2) : i1

Alloca:    %res : ty*
           ⟹  (%res = alloca ty) : ty*

Load:      %addr : ty*    %res : ty
           ⟹  (%res = load ty, ty* %addr) : ty

Store:     %val : ty    %addr : ty*
           ⟹  (store ty %val, ty* %addr) : void

Phi:       %res, %inp1, ..., %inpn : ty
           ⟹  (%res = phi ty [%inp1, %lab1] ... [%inpn, %labn]) : ty

Ret1:      ⟹  (ret void) : void

Ret2:      %res : ty
           ⟹  (ret ty %res) : ty

Branch1:   ⟹  (br label %block) : void

Branch2:   %cond : i1
           ⟹  (br i1 %cond, label %ttblock, label %ffblock) : void

NonDet:    %res : ty
           ⟹  (%res = lodin nd ty) : ty

Call:      %res : ret    [ pi, %pi : tyi ]_{i=1...n}
           ⟹  (%res = call ret @func (ty1 %p1, ..., tyn %pn)) : ret

GEP:       %res : res*    res = T_{ind2,...,indn}(ty)
           ⟹  (%res = getelementptr ty, ty* %ptr, ty1 ind1, ..., tyn indn) : res*

Figure 2: Type rules for LLVM, where (%res = inst ty %inp1, %inp2) ∈ Arith(R) ∪ Log(R) and (%res = cmp cc ty %inp1, %inp2) ∈ Cmp(R).

A context provides the LLVM program with an infinite set of register variables which the context maps to actual values. The intention is that an LLVM program maps LLVM registers to context register variables, i.e. uses a redirection table to obtain the values of the LLVM registers. This does end up complicating the semantics slightly, but allows calling a function twice in the LLVM program, i.e. enables recursion.
Definition 5 (Context). A context is a tuple A = (S_A, s_A^init, dom_A, R, ff_A) where
• S_A is a set of configuration states for the context,
• s_A^init ∈ S_A is the initial context state,
• dom_A assigns to each ty ∈ T a range of values that the type can attain values within,
• R is an infinite set of register variables, and
• ff_A ∈ dom_A(i1) is a representation of "false".
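One concrete instantiation of such a context is an explicit-state one, where register variables and memory map to actual values. The following sketch (our own Python rendering, not Lodin's actual implementation) shows a few of the operations discussed next; note that every operation returns a new context state rather than mutating the old one.

```python
# A minimal explicit-state context, sketched for intuition (illustrative;
# not Lodin's implementation). States are plain dicts, and each operation
# returns a *new* state, matching the functional style of the semantics.

class ExplicitContext:
    def __init__(self):
        self.initial = {"regs": {}, "mem": {}, "next_reg": 0, "next_addr": 1}

    def mReg(self, s):
        """Return (new state, fresh register variable)."""
        r = s["next_reg"]
        return {**s, "next_reg": r + 1}, r

    def Eval(self, s, r):
        return s["regs"][r]

    def Set(self, s, r, v):
        return {**s, "regs": {**s["regs"], r: v}}

    def alloc(self, s, nbytes):
        """Return (new state, fresh address) for nbytes of storage."""
        a = s["next_addr"]
        return {**s, "next_addr": a + nbytes}, a

    def load(self, s, addr):
        return s["mem"][addr]

    def store(self, s, v, addr):
        return {**s, "mem": {**s["mem"], addr: v}}

ctx = ExplicitContext()
s0 = ctx.initial
s1, r = ctx.mReg(s0)
s2 = ctx.Set(s1, r, 42)
print(ctx.Eval(s2, r))   # 42
print(r in s0["regs"])   # False: the old state is untouched
```

A symbolic context would keep the same interface but map register variables to terms over a constraint store instead of concrete values, which is exactly what lets the core semantics be shared between the engines.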
LLVM program to manipulate the states of a context. Mostof these operations are just semantical functions for
LLVM instructions (see Table 2). Instead of writing ◦ ( S, t , t ) = R when applying an operator, we usean infix notation J t ◦ t K S = R . Besides the instruc-tions in Table 2 we need instructions for creatingnew register variables ( mReg ) , evaluate the valueof a register variable ( Eval ty A ), loading ( load ty A ) andstoring ( store ty A ) values from/to memory, allocatingmemory ( alloc ty A ) and free’ing memory ( free ) . Wediscuss them briefly in the following from a usage-perspetice: mReg A : S A × R → S A × R This function takesa context state s A and a register % r , where % r : ty . It returns a register variable r ∈ R that can beused to store values of ty and a new context state s .Naturally, the context must ensure that the registervariable r is not already used in s A .7nstruction Operator SignatureAddition add + ty A S A × dom A ( ty ) × dom A ( ty ) → dom A ( ty ) Subtraction sub − ty A S A × dom A ( ty ) × dom A ( ty ) → dom A ( ty ) Multiplication mul · ty A S A × dom A ( ty ) × dom A ( ty ) → dom A ( ty ) Unsigned Division div / ty u A S A × dom A ( ty ) × dom A ( ty ) → dom A ( ty ) Signed Division sdiv / ty s A S A × dom A ( ty ) × dom A ( ty ) → dom A ( ty ) Signed Remainder rem % ty s A S A × dom A ( ty ) × dom A ( ty ) → dom A ( ty ) Unsigned Modulo srem % ty u A S A × dom A ( ty ) × dom A ( ty ) → dom A ( ty ) Shift left shl << ty A S A × dom A ( ty ) × dom A ( ty ) → dom A ( ty ) Logical Shift right lshr >> ty a A S A × dom A ( ty ) × dom A ( ty ) → dom A ( ty ) Arithmetic shift right ashr >> ty a A S A × dom A ( ty ) × dom A ( ty ) → dom A ( ty ) Bitwise and and & ty A S A × dom A ( ty ) × dom A ( ty ) → dom A ( ty ) Bitwise or or | ty A S A × dom A ( ty ) × dom A ( ty ) → dom A ( ty ) Bitwise xor xor & ty A S A × dom A ( ty ) × dom A ( ty ) → dom A ( ty ) Equality cmp eq == ty A S A × dom A ( ty ) × dom A ( ty ) → S A × dom A ( i ) × {⊤ , ⊥} Non-equality cmp ne = ty A S A × dom A ( ty ) × dom A ( ty ) → S A × dom A ( i ) × {⊤ , ⊥} 
Signed Greater than cmp sgt > ty s A S A × dom A ( ty ) × dom A ( ty ) → S A × dom A ( i ) × {⊤ , ⊥} Signed Greater than or equal cmp sge > ty s A S A × dom A ( ty ) × dom A ( ty ) → S A × dom A ( i ) × {⊤ , ⊥} Signed Lessr than or equal cmp sle ≤ ty s A S A × dom A ( ty ) × dom A ( ty ) → S A × dom A ( i ) × {⊤ , ⊥} Signed Less than cmp slt < ty s A S A × dom A ( ty ) × dom A ( ty ) → S A × dom A ( i ) × {⊤ , ⊥} Unsigned Greater than cmp ugt > ty u A S A × dom A ( ty ) × dom A ( ty ) → S A × dom A ( i ) × {⊤ , ⊥} Unsigned Greater than or equal cmp uge > ty u A S A × dom A ( i ) × dom A ( ty ) → S A × dom A ( ty ) × {⊤ , ⊥} Unsigned Less than or equal cmp ule ≤ ty u A S A × dom A ( ty ) × dom A ( ty ) → S A × dom A ( i ) × {⊤ , ⊥} Unsigned Less than cmp ult < ty u A S A × dom A ( ty ) × dom A ( ty ) → S A × dom A ( i ) × {⊤ , ⊥} Table 2: Operations for a context A = ( S A , s init , dom A , R ). They each take as input a context state andoperands and returns a new contet states and a return value. The compare instructions also return a valuein {⊤ , ⊥} . Eval ty A : S A × R → dom A ( ty ) This function takesa context state s and register variable r ∈ R , andreturns a value in dom A ( ty ). Set ty A : S A × R × dom A ( ty ) → S A This functiontakes a context state s , register variable r ∈ R withtype ty and a value v ∈ dom A ( ty ). It returns a newcontext state s ′ with r bound to the value v . load ty A : S A × dom A ( ty ∗ ) → dom A ( ty ) This functiontakes a context state s and a memory address in dom A ( ty ∗ ) and returns a subset of dom A ( ty ). store ty A : S A × dom A ( ty ) × dom A ( ty ∗ ) → S A Thisfunction takes a context state s and values v ∈ dom A ( ty ) and a ∈ dom A ( ty ∗ ) . It returns a new s ′ where the value the memory address a has been up-dated to the value v . 
alloc ty A : S A → S A × dom A ( ty ∗ ) This function takesa context state s and returns a tuple ( s ′ , t ) where t ∈ ty ∗ is a newly allocated memory address withspace for a type ty , and s ′ is a new context stateupdated with information that t is no longer freefor allocation. free A : S A × S ty ∈ T dom A ( ty ∗ ) → S A This func-tion takes a context state s and a value in k ∈ i ∈ B dom A ( i i ∗ ). It returns a new context state s ′ where the memory pointed to by k has been re-leased. NonDet ty A : S A → S A × dom A ( ty ) This functiontakes a context state s and returns a subset of dom A ( ty ) and a new context state. PtrAdd A ty ∗ A : dom A ( ty ∗ ) × Z → dom A ( ty ∗ ) Thisfunction takes a pointer p and natural number b andreturns a pointer new pointer after adding b bytesto p . Core Semantics
We are now ready to define the core semantics for a single LLVM process relative to a given context. The state of a single process (e.g. the instruction to be executed, what function it is executing, which block was previously executed, the mapping of the function's registers to context register variables) is kept in an activation record. The activation record also has a list of memory addresses that must be deallocated when control leaves the currently executing function. If a function calls another function, an activation record is pushed in front of the current one, thus forming a stack of activation records.

Remark 2. An activation record roughly corresponds to the well-known concept of a stack frame. LLVM does, however, not assume the existence of a stack; instead the activation record keeps a set of memory addresses that must be released when removing the activation record (corresponding to popping the stack frame in stack-based systems).
Definition 6 (Activation Record). An activation record, relative to a context (S_A, s_A^init, dom_A, R, ff_A), is a tuple (F, prev, cur, pc, π, Free) where
• F = (@N, R, P, BL, BBs, Bm, ret) is the LLVM function currently being executed,
• prev ∈ BL is the label of the block executed before the current one,
• cur ∈ BL is the label of the currently executed basic block,
• pc ∈ N is a pointer into the current basic block locating the next instruction to be executed,
• π : R → R maps registers to register variables of the context and
• Free is a set of memory addresses that must be deleted when removing this activation record.
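The activation-record stack can be rendered as follows (an illustrative Python sketch; the field names follow Definition 6, everything else is ours). Note how two records can map the same LLVM register to different context register variables via π, which is what enables recursion.

```python
from dataclasses import dataclass, field

# Sketch of activation records and the call stack of Definition 6
# (illustrative rendering, not Lodin's implementation).

@dataclass
class ActivationRecord:
    function: str   # @N of the executing function
    prev: str       # label of the previously executed block
    cur: str        # label of the current block
    pc: int         # index of the next instruction in cur
    pi: dict        # redirection table: LLVM register -> context register
    free: set = field(default_factory=set)  # addresses to release on return

def call(stack, callee, pi):
    """Push a fresh record: execution starts at block 'init', pc 0."""
    return [ActivationRecord(callee, "init", "init", 0, pi)] + stack

def ret(stack):
    """Pop the top record; its 'free' set would be released in the context."""
    return stack[0].free, stack[1:]

stack = call([], "@main", {"%x": 0})
stack = call(stack, "@helper", {"%x": 1})  # same LLVM register %x, fresh variable
freed, stack = ret(stack)
print(stack[0].function)  # @main
```

Each recursive call gets a fresh π via mReg_A, so the single-static-assignment restriction on LLVM registers never collides across invocations.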
Remark 3. Intuitively, an activation record is split into two parts: 1. a static part that indicates the instruction to be executed, given by F, prev, cur and pc, and 2. a dynamic part that links the process to the memory model of the context, given by π and Free.

A stack of activation records is a structure s_1 : s_2 : ... : s_n where each s_i is an activation record. The empty stack is denoted by ε. In the transition rules in Figures 3-7, we usually use the notation s : SL, meaning that s is the head of the stack and SL is the remaining part of the stack. We also write Inst ≝ (EXPR) to denote that Inst is syntactically equivalent to EXPR.

The transition rules are defined relative to a context state and a module. Given a context state s_A and a module M, the rules define how to execute an instruction Inst from a state (s, SL), where s is an activation record and SL is a stack, to produce the tuple ((s′, SL′), s_A′), where (s′, SL′) is a new state and s_A′ is a new context state. We write this as

    s_A, M ⊢ (s, SL) --Inst--> ((s′, SL′), s_A′).

The rules may look intimidating but most of them are fairly straightforward. As an example let us briefly consider the rule for binary operators (that are not comparisons):
Binary
Inst = Bm ( cur )[ pc ] v ∈ J Eval ty A ( s , r )) ◦ ( inst ) Eval ty A ( s , r ) K s s , M ⊢ ( s, SL ) Inst −−−→ (( F , prev , cur , pc + 1 ,π, Free ) , SL ) , Set ty A ( s , r res ,v ) , s =( F , prev , cur , pc ,π, Free ) F =(@ N , R , P , BL , BBs , Bm , ret ) , Inst def =(% res = instty % inp1 , % inp2 ) r = π (% inp1 ) r = π (% inp2 ) r res = π (% res ) ) This rule says, that in order to execute an in-struction % res = inst ty % inp1 , % inp2 we first figureout which register variables in s that contain thevalues of % inp1 , % inp , % res . This look up is done with9 lloc Inst = Bm ( cur )[ pc ] alloc ty A ( s ) = s ′ ,v s , M ⊢ ( s, SL ) Inst −−−→ (( F , prev , cur , pc + 1 ,π, Free ∪ { m } ) , SL ) , Set ty ∗A ( s ′ , r res ,v ) , s =( F , prev , cur , pc ,π, Free ) F =(@ N , R , P , BL , BBs , Bm , ret ) , Inst def =(% res = allocaty ) r res = π (% res ) Load
Inst = Bm ( cur )[ pc ] v ∈ load ty A ( s , Eval ty ∗A ( s , r addr )) s , M ⊢ ( s, SL ) Inst −−−→ (( F , prev , cur , pc + 1 ,π, Free ) , SL ) , Set ty A ( s , r res ,v ) , s =( F , prev , cur , pc ,π, Free ) F =(@ N , R , P , BL , BBs , Bm , ret ) , Inst def =(% res = loadty , ty ∗ % add ) r addr = π (% addr ) r res = π (% res ) Store
Inst = Bm ( cur )[ pc ] store ty A ( s , Eval ty A ( s , r val ) , Eval ty ∗A ( s , r addr )) = s ′ s , M ⊢ ( s, SL ) Inst −−−→ (( F , prev , cur , pc + 1 ,π, Free ) , SL ) , s ′ , s =( F , prev , cur , pc ,π, Free ) F =(@ N , R , P , BL , BBs , Bm , ret ) , Inst def =( storety % val , ty ∗ % addr ) r val = π (% val ) r addr = π (% addr ) GEP
Inst = Bm ( cur )[ pc ] s , M ⊢ ( s ′ , SL ) Inst −−−→ ( s ′′ , SL ) , s s , M ⊢ ( s, SL ) Inst −−−→ ( s ′′ , SL ) , Set ty A ( s , r res , PtrAdd A ( Eval ty ∗A ( s , r ptr )) ,k ) , s =( F , prev , cur , pc ,π, Free ) ,s ′ =( F , prev , cur , pc +1 ,π, Free ) , F =(@ func , R , P , BL , BBs , Bm , ret ) Inst def =(% res = getelementptrty , ty ∗ % ptr , ty1ind1 ..., tynindn ) r ptr = π (% ptr ) r res = π (% res ) k = T ind2 ,..., indn ( ty )+ ind1 · BSize ( ty ) Figure 3: Transition Rules for memory instructionscalls to π and results kept in r , r , r res . Then we eval-uate the value of r and r in s via calls to Eval ty A ,and the operation corresponding to inst is looked upwith ◦ (see Table 2 for this mapping) and applied ( J Eval ty A ( s , r )) ◦ ( inst ) Eval ty ( s , r ) K s ) giving a newcontext state ( s ′ ), and the value of the operation( v ). Set ty ( s ′ , r res , v ) stores this new value in r res andreturns the new context state. Finally we updatethe program counter ( pc + 1).In the rules special care has to be taken for the phi ty instructoins. All of these must be evaluated si-multaneously. We therefore evaluate the them in abig-step fashion where the evaluation of one instruc-tion also result in evaluating the next instruction (ifit is also a phi ty instruction). For the getelementptr rule, we use the auxillary function T i ,...i n ( h ty1 , . . . , tyn i ) = i − X k =1 BSize ( tyk ) + T i ,...i n ( tyi ) T ǫ ( ty ) = 0to calculate the offset needed to access the correctelement of the designated type. Remark 4. If Lodin has some functions definedin a platform plugin, the call rule in Figure 7 is re-placed by the implementation described in that mod-ule instead. Platform functions are executed atomi-cally in
Lodin.

Branch Unconditional: If Inst = Bm(cur)[pc] with Inst = (br label %block), s = (F, prev, cur, pc, π, Free) and F = (@N, R, P, BL, BBs, Bm, ret), then

s_A, M ⊢ (s, SL) --Inst--> ((F, cur, block, 0, π, Free), SL), s_A.

Branch Conditional True: If Inst = Bm(cur)[pc] with Inst = (br i1 %cond, label %ttblock, label %ffblock), s = (F, prev, cur, pc, π, Free), F = (@N, R, P, BL, BBs, Bm, ret) and [[ Eval^{i1}_A(s_A, π(%cond)) ≠^{i1}_A ff_A ]] s_A = s'_A, _, ⊤, then

s_A, M ⊢ (s, SL) --Inst--> ((F, cur, ttblock, 0, π, Free), SL), s'_A.

Branch Conditional False: If Inst = Bm(cur)[pc] with Inst = (br i1 %cond, label %ttblock, label %ffblock), s = (F, prev, cur, pc, π, Free), F = (@N, R, P, BL, BBs, Bm, ret) and [[ Eval^{i1}_A(s_A, π(%cond)) ==^{i1}_A ff_A ]] s_A = s'_A, _, ⊤, then

s_A, M ⊢ (s, SL) --Inst--> ((F, cur, ffblock, 0, π, Free), SL), s'_A.

Return Void: If Inst = Bm(cur)[pc] with Inst = (ret void), s = (F, prev, cur, pc, π, Free), F = (@N, R, P, BL, BBs, Bm, ret), Free = {f_1, f_2, ..., f_n}, SL = s' : SL' and [s_i = free_A(s_{i−1}, f_i)]_{i=1...n} with s_0 = s_A, then

s_A, M ⊢ (s, SL) --Inst--> (s', SL'), s_n.

Return Value: If Inst = Bm(cur)[pc] with Inst = (ret ty %val), s = (F, prev, cur, pc, π, Free), F = (@N, R, P, BL, BBs, Bm, ty), Free = {f_1, f_2, ..., f_n}, SL = s' : SL' with s' = (F', prev', cur', pc', π', Free'), F' = (@func', R', P', BL', BBs', Bm', ret'), where the calling instruction Bm'(cur')[pc' − 1] = Inst_c = (%res = call ty @N(ty_1 %p1 ... ty_n %pn)) and r_v = π'(%res), and [s_i = free_A(s_{i−1}, f_i)]_{i=1...n} with s_0 = s_A, then

s_A, M ⊢ (s, SL) --Inst--> (s', SL'), Set^ty_A(s_n, r_v, Eval^ty_A(s_A, π(%val))).

Figure 4: Transition rules for terminator instructions.
Phi: If Inst = Bm(cur)[pc] with Inst = (%res = phi ty [%inp1, %lab1] ... [%inpn, %labn]), Bm(cur)[pc + 1] ∈ Phi(R, BL), ∃i. lab_i = prev, r_inp = π(%inp_i), r_res = π(%res), s = (F, prev, cur, pc, π, Free), s' = (F, prev, cur, pc + 1, π, Free), F = (@func, R, P, BL, BBs, Bm, ret) and s_A, M ⊢ (s', SL) --Inst--> (s'', SL), s'_A, then

s_A, M ⊢ (s, SL) --Inst--> (s'', SL), Set^ty_A(s'_A, r_res, Eval^ty_A(s_A, r_inp)).

Phi2: If Inst = Bm(cur)[pc] with Inst = (%res = phi ty [%inp1, %lab1] ... [%inpn, %labn]), Bm(cur)[pc + 1] ∉ Phi(R, BL), ∃i. lab_i = prev, r_inp = π(%inp_i), r_res = π(%res), s = (F, prev, cur, pc, π, Free), s' = (F, prev, cur, pc + 1, π, Free) and F = (@func, R, P, BL, BBs, Bm, ret), then

s_A, M ⊢ (s, SL) --Inst--> (s', SL), Set^ty_A(s_A, r_res, Eval^ty_A(s_A, r_inp)).

Figure 5: Transition rules for phi instructions.
Compare: If Inst = Bm(cur)[pc] with Inst = (%res = cmp cond ty %inp1, %inp2), γ_ty(cond) = (op, ...), r_1 = π(%inp1), r_2 = π(%inp2), s = (F, prev, cur, pc, π, Free), F = (@N, R, P, BL, BBs, Bm, ret) and [[ Eval^ty_A(s_A, r_1) op^ty Eval^ty_A(s_A, r_2) ]] s_A = s'_A, v, _, then

s_A, M ⊢ (s, SL) --Inst--> ((F, prev, cur, pc + 1, π, Free), SL), Set^ty_A(s'_A, π(%res), v).

Figure 6: Transition rules for comparison instructions.

Binary: If Inst = Bm(cur)[pc] with Inst = (%res = inst ty %inp1, %inp2), r_1 = π(%inp1), r_2 = π(%inp2), r_res = π(%res), s = (F, prev, cur, pc, π, Free), F = (@N, R, P, BL, BBs, Bm, ret) and v ∈ [[ Eval^ty_A(s_A, r_1) ◦(inst) Eval^ty_A(s_A, r_2) ]] s_A, then

s_A, M ⊢ (s, SL) --Inst--> ((F, prev, cur, pc + 1, π, Free), SL), Set^ty_A(s_A, r_res, v).

Call Function
If Inst = Bm(cur)[pc] with Inst = (%res = call ret @func(ty_1 %p1 ... ty_n %pn)), s = (F, prev, cur, pc, π_old, Free) and F' = (@func, R, P, BL, BBs, Bm, ret) ∈ M with P = {p_0, ..., p_{n−1}} and R = {r_1, ..., r_m}, then fresh registers are allocated by [s_i, g_i = mReg_A(s_{i−1}, r_i) ∧ π_i = π_{i−1}[r_i ↦ g_i]]_{i=1...m} (with s_0 = s_A), the arguments are evaluated as v_i = Eval^{ty_i}_A(s_A, π_old(%p_i)) and stored by [s_{m+i+1} = Set^{ty_i}_A(s_{m+i}, π_m(p_i), v_i)]_{i=0...n−1}, and, with s' = (F, prev, cur, pc + 1, π_old, Free),

s_A, M ⊢ (s, SL) --Inst--> ((F', init, init, 0, π_m, ∅), s' : SL), s_{m+n}.

NonDet: If Inst = Bm(cur)[pc] with Inst = (%res = lodin_nd ty), r_res = π(%res), s = (F, prev, cur, pc, π, Free), F = (@N, R, P, BL, BBs, Bm, ret), NonDet^ty_A(s_A) = (V, s'_A) and v ∈ V, then

s_A, M ⊢ (s, SL) --Inst--> ((F, prev, cur, pc + 1, π, Free), SL), Set^ty_A(s'_A, r_res, v).

Figure 7: Miscellaneous rules.

define void @stub () {
init:
  call void @N ()
  br label %loop
loop:
  br label %loop
}

LLVM-Listing 3: Stub function (stub_F) for instantiating an entry point F = (@N, R, P, BL, BBs, Bm, void).

Network of Processes
Let M = (F, E) be an LLVM module where F = {F_1, ..., F_n} with F_i = (@N_i, R_i, P_i, BL_i, BBs_i, Bm_i, ret_i) and E = {k_1, ..., k_m}, and let A = (S_A, s_init, dom_A, R, tt_A, ff_A) be a context. We define the transition system L^A_M = (N, n_0, −→_A), where a state n ∈ N is a tuple n = (s_1, s_2, ..., s_m, s_A, M) in which each s_i is a state of a process and s_A ∈ S_A. A state n = (s_1, ..., s_i, ..., s_m, s_A, M) may transit to a state n' = (s_1, ..., s'_i, ..., s_m, s'_A, M) via the ith component performing an instruction Inst if s_A, M ⊢ s_i --Inst--> s'_i, s'_A. We write this as n --Inst-->^i_A n'. The initial state n_0 is ((κ_1, ε), ..., (κ_m, ε), s_init, M), where κ_i = (stub_{F_{k_i}}, init, init, 0, π_i, ∅) and stub_{F_{k_i}} is the special stub function shown in LLVM-Listing 3.
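The interleaving in the definition above can be sketched as nondeterministic scheduling: any process that can step advances together with the shared context state. The following is a toy model of the successor relation, not Lodin's engine; the local `step` relation and state shapes are illustrative assumptions.

```python
def network_successors(state, step):
    """All successor states of (procs, ctx): any one process may move."""
    procs, ctx = state
    out = []
    for i, p in enumerate(procs):
        for p2, ctx2 in step(p, ctx):  # local small-step relation of process i
            out.append((procs[:i] + (p2,) + procs[i + 1:], ctx2))
    return out

# toy processes: each is a program counter 0..2; the context counts steps
step = lambda pc, ctx: [(pc + 1, ctx + 1)] if pc < 2 else []
succs = network_successors(((0, 0), 0), step)
assert len(succs) == 2                     # either process may move first
assert ((1, 0), 1) in succs and ((0, 1), 1) in succs
```

Exploring the whole transition system then amounts to iterating this successor function from the initial state.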
In the preceding section we developed the semantics of LLVM programs abstractly, i.e. we defined an "interface" to a context of the semantics, allowing different semantics to be obtained by changing the instantiation of this interface. In this section we develop two instantiations (E and S) of the interface. The resulting transition semantics for a module M, L^E_M (L^S_M), we call the explicit (symbolic) semantics.

Bitvectors
Let B = {0, 1}; a bitvector of width n is an element of B^n. Two special bitvectors are ~0_n = (0, 0, ..., 0) ∈ B^n and ~1_n = (1, 1, ..., 1) ∈ B^n. If b = (b_0, b_1, ..., b_{n−1}) ∈ B^n is a bitvector, then we can access individual bits by indexing into b, i.e. b[i] = b_i. We also allow extracting the sub-vector (b_i, ..., b_j) by b[i : j + 1]. If b = (b_0, b_1, ..., b_{n−1}) ∈ B^n, c = (c_0, ..., c_{i−1}) ∈ B^i, k ∈ {0, ..., n − 1} and k + i ≤ n, then we let

b[k : k + i / c] = (b_0, b_1, ..., b_{k−1}, c_0, ..., c_{i−1}, b_{k+i}, ..., b_{n−1}).

Let b = (b_0, b_1, ..., b_{n−1}) ∈ B^n be a bitvector; we can interpret it as either an unsigned or a signed integer. In the former case we use the standard binary encoding and define ⟨b⟩ = Σ_{i=0}^{n−1} b_i · 2^i. In the latter case we use 2's-complement encoding and let ⟨·b·⟩ = −b_{n−1} · 2^{n−1} + Σ_{i=0}^{n−2} b_i · 2^i. To encode a number k ∈ N in binary or 2's complement we write ⟨k⟩^{−1} and ⟨·k·⟩^{−1} respectively.

The classic bitwise operators and, or, xor and negation between vectors b_1, b_2 ∈ B^n are defined as usual and denoted (b_1 and b_2), (b_1 or b_2), (b_1 xor b_2) and (neg b_1) respectively. If b ∈ B^n is a bitvector, d ∈ N is a number and d < n, then we define the bit-shifting operations as

b lshl d = ~0_n[d : n / b[0 : n − d]],
b lshr d = ~0_n[0 : n − d / b[d : n]],
b ashr d = ~0_n[0 : n − d / b[d : n]] if b[n − 1] = 0, and ~1_n[0 : n − d / b[d : n]] if b[n − 1] = 1.

The lshl (lshr) operator is a logical left (right) bit shift, i.e. it shifts all bits to the left (right) and pads with zeros. The ashr operator is an arithmetic right shift where, instead of padding with zeros, the bitvector is padded with the original value of the most significant bit.

Memory Modelling
In the explicit semantics we model the memory state of a computer as a (possibly) infinite-length array of memory blocks. Memory blocks are tagged with their size and the actual content of the block. Formally, the memory state of a program is a function M : N → (N × (∪_{i∈N} B^i)) ∪ {⊥}. An entry M(i) = ⊥ means that block i of the memory has not been used. If M(i) = (k, b) and b ∈ B^k then we say that block i is consistent, has size k, and that b is the content of that block.

To modify and read from memory, we define the functions:

• new(M, i) = (M[n ↦ (i, ~0_i)], n) where n = min({g | M(g) = ⊥}),
• free(M, i) = M[i ↦ ⊥],
• read(M, b, f, len) = b'[f : f + len] where M(b) = (i, b') and f + len ≤ i,

Figure 8: Memory representation in
Lodin. Pointers are 64-bit integers split into a 32-bit base (block) and a 32-bit offset. Lodin uses a redirection table (M) that stores memory blocks; block indexes into this table, while offset indexes into the memory block itself. The symbol ⊥ indicates an entry in M is unused.

• write(M, b, f, c, len) = M[b ↦ b'[f : f + len / c]] where M(b) = (i, b') and f + len ≤ i.

The initial state of the memory is the function M_init where, for all i, M_init(i) = ⊥.

Given both a representation of the register values and the memory, we can now define the explicit context. In the explicit context, we assign to a type i_n the domain B^n, and any pointer type is assigned the domain B^64. Using a 64-bit bitvector for representing pointers allows us to use the 32 most significant bits for indexing into M and the 32 least significant bits for indexing into the actual block. For a pointer p ∈ B^64 we let block(p) = p[32 : 64] and offset(p) = p[0 : 32]. See Figure 8 for a graphical depiction of how this works.

Definition 7 (Explicit Context). The explicit context is the tuple E = (S_E, s^E_init, dom_E, N, tt_E, ff_E) where

• S_E = {(M, N, F) | M is a memory state ∧ N ⊂ N ∧ F : N → (∪_{i∈N} B^i) ∪ {⊥}},
• s^E_init = (M_init, ∅, F) where for all i, F(i) = ⊥,
• dom_E(t) = B^{8 · BSize(t)},
• ff_E = ⟨0⟩^{−1}.

Reg_E((M, N, F), %r) = ((M, N ∪ {i}, F), i) where i = min(ℕ \ N)

Eval^ty_E((M, N, F), i) = F(i) if F(i) ∈ dom_E(ty), and Error
otherwise.

alloc^ty_E((M, N, F)) = ((M', N, F), i) if ty = i_n and new(M, BSize(ty)) = (M', i)

free_E((M, N, F), i) = (free(M, k), N, F) if i ∈ B^64, k = ⟨i[32:64]⟩ is an allocated block and ⟨i[0:32]⟩ = 0, and Error otherwise

load^ty_E((M, N, F), i) = {((M, N, F), read(M, k, o, m))} if k = ⟨i[32:64]⟩ is allocated, dom_E(ty) = B^m, o = ⟨i[0:32]⟩ and o + m does not exceed the size of block k

For the comparison operators we show >^ty_uE and >^ty_sE below, and note that the remaining comparison operators are easily generalised from these. In the rules we let tt_E ∈ dom_E(i1) and require tt_E ≠ ff_E.

>^ty_uE(s, r_1, r_2) = (s, tt_E, ⊤) if ⟨r_1⟩ > ⟨r_2⟩, and (s, ff_E, ⊥) otherwise
>^ty_sE(s, r_1, r_2) = (s, tt_E, ⊤) if ⟨·r_1·⟩ > ⟨·r_2·⟩, and (s, ff_E, ⊥) otherwise

Remark 5. Instantiating a model with the explicit context as described so far results in a possibly infinite state space. As a result, an exhaustive enumeration of all possible states may not terminate.
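The block/offset pointer scheme of Figure 8 can be illustrated with a small sketch (hypothetical helper names, not Lodin's code): a pointer packs a 32-bit block index into its high bits and a 32-bit offset into its low bits, and a redirection table maps block indexes to block contents.

```python
class Memory:
    def __init__(self):
        self.table = {}   # redirection table: block index -> bytearray
        self.next = 1     # smallest unused block index

    def new(self, size):
        b = self.next
        self.next += 1
        self.table[b] = bytearray(size)
        return b << 32    # pointer: block in the 32 high bits, offset 0

    def free(self, ptr):
        assert ptr & 0xFFFFFFFF == 0   # only base pointers may be freed
        del self.table[ptr >> 32]

    def read(self, ptr, length):
        block, off = ptr >> 32, ptr & 0xFFFFFFFF
        buf = self.table[block]
        assert off + length <= len(buf)   # out-of-bounds check
        return bytes(buf[off:off + length])

    def write(self, ptr, data):
        block, off = ptr >> 32, ptr & 0xFFFFFFFF
        buf = self.table[block]
        assert off + len(data) <= len(buf)
        buf[off:off + len(data)] = data

m = Memory()
p = m.new(4)             # allocate a 4-byte block
m.write(p + 1, b"\x2a")  # adding to a pointer moves the offset (cf. PtrAdd)
assert m.read(p, 4) == b"\x00\x2a\x00\x00"
```

Note how the bounds check falls out of the representation: every access knows the size of the block it targets.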
We have already mentioned that an explicit representation of values in a program will explode (even without concurrency) in the presence of non-deterministic values. As an example, consider LLVM-Listing 4, which calls the function @error if and only if %2 is set to 5. It is easy for a human to realise that @error can be called, but a computer with an explicit representation has to enumerate all 2^32 possible values of %2. To combat this, Lodin provides a symbolic context representation. Instead of representing values explicitly, the symbolic context gathers all operations performed during exploration into one large logical formula, known as the path formula, that can then be passed to an SMT solver. The SMT solver can determine whether the formula is satisfiable and thus whether the explored path is feasible.
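As a toy illustration of the path-formula idea (hand-written Python, not Lodin's encoding), the constraints collected along the path of LLVM-Listing 4 that reaches @error reduce to a single condition on the non-deterministic input. A solver asks whether some input satisfies that condition; here we mimic the question by brute force over a small range, which is exactly the enumeration an SMT solver avoids.

```python
def path_condition(nd):
    # constraints collected along init -> %call in LLVM-Listing 4,
    # with the lodin_nd result kept symbolic as `nd`:
    mem = nd           # store i32 %2 ; %3 = load i32
    return mem == 5    # %4 = icmp eq i32 %3, 5 ; br i1 %4 taken to %call

# feasibility: does some input make the path condition true?
assert any(path_condition(v) for v in range(256))
assert [v for v in range(256) if path_condition(v)] == [5]
```

A satisfying assignment (here the single value 5) doubles as a concrete witness input driving the program to @error.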
An SMT instance is principally a first-order logic formula where some predicates and functions have special interpretations. These special interpretations are encapsulated into what are called theories. An SMT instance over a theory T can be determined to be satisfiable or unsatisfiable by an SMT solver supporting T. We will not invest too much time here in discussing how SMT solvers work, but will rather informally present the theories we need.

define void @main () {
init:
  %1 = alloca i32 , align 4
  %2 = lodin_nd i32
  store i32 %2, i32* %1, align 4
  %3 = load i32 , i32* %1, align 4
  %4 = icmp eq i32 %3, 5
  br i1 %4, label %call , label %done
call:                                  ; preds = %init
  call void (...) @error ()
  br label %done
done:                                  ; preds = %call , %init
  ret void
}

LLVM-Listing 4: Example of why a symbolic representation is necessary.

Theory of Bitvectors
In the theory of bitvectors, variables are given a bitvector type i_n. The operations that can be performed on bitvectors are

• the classic bitwise operations, i.e. and, or, neg, xor, lshl, lshr and ashr,
• arithmetic operations (modulo 2^n), i.e. add, sub, div_u, div_s, mul, rem_u, rem_s; as in the LLVM discussion we need both signed and unsigned versions of some operations (indexed by u and s),
• comparisons, e.g. = and ≤,
• boolean operations, e.g. ∧, ∨, ¬,
• concatenation of bitvectors, ◦.

Note we reuse the type name from
LLVM.

The arithmetic operations of the explicit context are defined as follows (division, remainder or shift with an out-of-range second operand yields an arbitrary value):

+^ty_E(S, b_1, b_2) = {⟨⟨b_1⟩ + ⟨b_2⟩⟩^{−1}} if ⟨b_1⟩ + ⟨b_2⟩ ≤ ⟨~1_m⟩, and {⟨(⟨b_1⟩ + ⟨b_2⟩) mod 2^m⟩^{−1}} otherwise
−^ty_E(S, b_1, b_2) = {⟨k_1 + k_2⟩^{−1}} if k_1 = ⟨b_1⟩, k_2 = ⟨neg b_2⟩ + 1 and k_1 + k_2 ≤ ⟨~1_m⟩, and {⟨(k_1 + k_2) mod 2^m⟩^{−1}} otherwise
·^ty_E(S, b_1, b_2) = {⟨⟨b_1⟩ · ⟨b_2⟩⟩^{−1}} if ⟨b_1⟩ · ⟨b_2⟩ ≤ ⟨~1_m⟩, and {⟨(⟨b_1⟩ · ⟨b_2⟩) mod 2^m⟩^{−1}} otherwise
/^ty_uE(S, b_1, b_2) = {⟨⌊⟨b_1⟩ / ⟨b_2⟩⌋⟩^{−1}} if ⟨b_2⟩ ≠ 0, and {c | c ∈ B^m} otherwise
/^ty_sE(S, b_1, b_2) = {⟨· trunc(⟨·b_1·⟩ / ⟨·b_2·⟩) ·⟩^{−1}} if ⟨·b_2·⟩ ≠ 0, and {c | c ∈ B^m} otherwise
%^ty_uE(S, b_1, b_2) = {⟨⟨b_1⟩ − ⟨b_2⟩ · ⌊⟨b_1⟩ / ⟨b_2⟩⌋⟩^{−1}} if ⟨b_2⟩ ≠ 0, and {c | c ∈ B^m} otherwise
%^ty_sE(S, b_1, b_2) = {⟨· ⟨·b_1·⟩ − ⟨·b_2·⟩ · trunc(⟨·b_1·⟩ / ⟨·b_2·⟩) ·⟩^{−1}} if ⟨·b_2·⟩ ≠ 0, and {c | c ∈ B^m} otherwise
<<^ty_E(S, b_1, b_2) = {b_1 lshl ⟨b_2⟩} if ⟨b_2⟩ < m, and {c | c ∈ B^m} otherwise
PtrAdd_E(b, k) = b' where block(b') = block(b) and offset(b') = offset(b) + k

We reuse the operations from our discussion of bitvectors in subsection 3.1, and require that the SMT solver implements the semantics of the operations as described there. Likewise, we write constant bitvectors using the notation from subsection 3.1.

Theory of Arrays

In this theory an array is a mapping between elements. Elements of an array can be read using a select function, and an element stored in an array using a store function. We introduce the array type {i_n} → {i_m}, mapping elements from i_n to i_m. If v : {i_n} → {i_m}, v_1 : i_n and v_2 : i_m, then we write store(v, v_1, v_2) to create a new array that is equal to v except that v_1 now maps to the value of v_2. We write v_2 = select(v, v_1) to set v_2 equal to the value kept at position v_1.

In the following we use V to denote an infinite set of SMT variables. We also use the restricted sets V_ty = {v ∈ V | v : ty}.
Similarly, we refer by W to all SMT expressions over V, and by W_ty to all SMT expressions with type ty.

The Symbolic Context

The symbolic context in Lodin maps its register variables to SMT variables and uses a so-called path formula to capture all constraints (assignments and comparisons) encountered during a program execution. Memory is represented using an SMT array, and an SMT variable points to the first place in memory that is free for allocation.

Definition 8 (Symbolic Context). The symbolic context for the symbolic semantics is the tuple S = (S_S, s_init, dom_S, N, tt_S, ff_S) where

• S_S are tuples (v_M, v_f, N, F, ψ, used) where
  – v_M : {i64} → {i8} is an array representing the memory state of the program,
  – v_f : i64 is a pointer into memory,
  – N ⊆ N is a set of used register variables,
  – F : N → V ∪ {⊥},
  – ψ is an SMT formula, the path formula, encoding the constraints that an explored path has to satisfy, and
  – used ⊆ V is a set of used SMT variables.
• s_init = (v_M, 0, ∅, F, ff_S == ff_S, ∅) where for all n ∈ N, F(n) = ⊥,
• dom_S(i_n) = W_{i_n}, dom_S(ty∗) = W_{i64}, and dom_S(⟨ty_1, ..., ty_n⟩) = W_{8 · BSize(⟨ty_1, ..., ty_n⟩)},
• ff_S = ~0.

The arithmetic instructions (e.g. +^ty_S(s, v_1, v_2)) that we need to implement for the context are straightforward to represent: all we need to do is create an SMT expression corresponding to the operation. Below we give a generalised definition of the rule:

∼^ty_S((v_M, v_f, N, F, ψ, used), v_1, v_2) = v_1 SMTOp v_2

For the mapping between ∼^ty_S and SMTOp we refer to Table 3.

The comparison operators are very similar to the binary operators; below we provide an example for the >^ty_uS(s, v_1, v_2) function where s = (v_M, v_f, N, F, ψ, used):

>^ty_uS(s, v_1, v_2) = ((v_M, v_f, N, F, ψ ∧ (v_1 >_u v_2), used), v_1 >_u v_2, ⊤)

For the remainder of the operations we refer the reader to Figure 11 and Figure 12.

Example 1.
We briefly return to the module (M) in LLVM-Listing 4 and consider how we can use the symbolic representation of Lodin to determine whether the function @error can be called. We simply instantiate the symbolic transition system L^S_M = (N, n^S_0, −→_S) and generate symbolic states from n^S_0 until we reach a state n_f = (s : s_1 ··· : ε, s_S, M)

Reg_S((v_M, v_f, N, F, ψ, used), %r) = ((v_M, v_f, N ∪ {i}, F, ψ, used), i) where i = min(ℕ \ N)
Eval^ty_S((v_M, v_f, N, F, ψ, used), i) = F(i) if F(i) ∈ dom_S(ty), and Error otherwise
Set^ty_S((v_M, v_f, N, F, ψ, used), l, v) = (v_M, v_f, N, F, ψ ∧ (F(l) = v), used) if l ∈ N ∧ v ∈ dom_S(ty), and Error otherwise
alloc^ty_S((v_M, v_f, N, F, ψ, used)) = ((v_M, v_f', N, F, ψ ∧ (v_f' = v_f add n), used ∪ {v_f'}), v_f) if ty = i_n
PtrAdd_S(v_b, k) = v_b add k

Figure 11: Evaluating and setting registers in the symbolic context.

load^{i_n}_S((v_M, v_f, N, F, ψ, used), i) = SymbLoad_{i_n}(v_M, F(i))
SymbLoad_{i_n}(v_M, v_a) = select(v_M, v_a) if i_n = i8, and select(v_M, v_a) ◦ SymbLoad_{i_{n−8}}(v_M, v_a add 1) otherwise
store^ty_S((v_M, v_f, N, F, ψ, used), v_v, v_p) = (v_M', v_f, N, F, ψ ∧ (v_M' = SymbStore_ty(v_M, v_v, v_p)), used ∪ {v_M'})
SymbStore_{i_n}(v_M, v_v, v_p) = store(v_M, v_p, v_v) if i_n = i8, and SymbStore_{i_{n−8}}(store(v_M, v_p, v_v[0 : 8]), v_p add 1, v_v[8 : n]) otherwise

Figure 12: Store and load operations in the symbolic context.

where s = (@main, prev, call, pc, π, Free) and s_S = (v_M, v_f, N, F, ψ, used). Reaching n_f reveals that there is a path in the control flow graph of @main that reaches the call block (and thereby the call instruction), but not that it is feasible. To ensure feasibility, we invoke an SMT solver and check whether ψ is satisfiable.
If this is the case, we can read the values of all registers used along that path from the satisfying assignment of the SMT solver.

Remark 7. The symbolic context assigns each register of an LLVM program a single SMT variable, and gathers constraints over these SMT variables in a path formula. Assignments to LLVM registers are captured by equalities between SMT variables and SMT expressions. As a result, the symbolic context does not support assigning to the same register multiple times; it is thus only applicable to programs without loops in their control flow graph.

Merging Symbolic States

It is often convenient to merge symbolic context states into one state. This allows exploring several computational paths simultaneously and helps combat the path-explosion problem, which is a big problem for symbolic execution engines such as Klee.

∼^ty_S    SMTOp
+^ty_S    add
−^ty_S    sub
·^ty_S    mul
/^ty_uS   div_u
/^ty_sS   div_s
%^ty_uS   rem_u
%^ty_sS   rem_s
<<^ty_S   lshl
>>^ty_lS  lshr
>>^ty_aS  ashr
&^ty_S    and
|^ty_S    or
⊕^ty_S    xor

Table 3: Mapping between semantic operators and SMT operators.

For merging context states s_S = (v_M, v_f, N, F, ψ, used) and s'_S = (v'_M, v'_f, N', F', ψ', used'), where for all n ∈ N ∩ N' it is the case that F(n) = F'(n), we introduce the function merge : S_S × S_S → S_S defined as

merge(s_S, s'_S) = (v''_M, v''_f, N ∪ N', F'', (ψ ∨ ψ') ∧ ψ'' ∧ ψ''', used ∪ used' ∪ {v''_M, v''_f, v_P})

where
• F''(n) = F(n) if n ∈ N, and F'(n) if n ∈ N',
• v''_M, v''_f, v_P ∉ used ∪ used',
• ψ'' = (v''_M = ite(v_P, v_M, v'_M)),
• ψ''' = (v''_f = ite(v_P, v_f, v'_f)).

Here v_P, with type i1, is a fresh SMT variable, and ite(v_P, v, v') evaluates to v' if v_P = ff_S and to v otherwise.

Model Checking [1, 8] is a technique widely used in academia for validating that a formal model of a program behaves correctly, according to a specification given by a logical formula.
A basic specification is a reachability specification, where we are interested in finding a state where a given proposition is true. This is the main focus in Lodin, and thus we limit our discussion to this setting. (We define the exact propositions of Lodin in a short while.)

At the core of any reachability checking algorithm is a transition system to search and a set of atomic propositions. In the case of Lodin, the state space we search is L^E_M = (N, n_0, −→_E). Atomic propositions of a program are elements that may be true or false in a state (for instance whether x == 5 holds, or whether a state has a DataRace). An interpretation (over states N) of an atomic proposition p is a function P_p : N → {tt, ff}, where tt indicates p is true and ff that it is false. Atomic propositions may be combined with the classical boolean operators ∧, ∨ and ¬. The interpretations of these combined propositions are defined recursively as

• P_{ψ1 ∧ ψ2}(n) = P_{ψ1}(n) ∧ P_{ψ2}(n),
• P_{ψ1 ∨ ψ2}(n) = P_{ψ1}(n) ∨ P_{ψ2}(n),
• P_{¬ψ}(n) = ¬P_ψ(n),

where ψ1 and ψ2 are combined propositions themselves. Checking reachability for the proposition ψ is now to check whether we can, from the initial state, reach a state n where P_ψ(n) = tt. The classical approach for such a search is the fix-point algorithm in Algorithm 1.

Data: Property φ
Data: Initial state n_0
Result: ⊤ or ⊥
Passed := ∅;
Waiting := {n_0};
while Waiting ≠ ∅ do
    Let n_c ∈ Waiting;
    Waiting := Waiting \ {n_c};
    if P_φ(n_c) then
        return ⊤
    end
    Passed := Passed ∪ {n_c};
    Waiting := Waiting ∪ {n' | ∃ i, Inst s.t. n_c --Inst-->^i_E n'};
    Waiting := Waiting \ Passed
end
return ⊥

Algorithm 1: The classic reachability algorithm. States that have been found but not yet explored are kept in the set Waiting, and states that have already been processed are kept in Passed.

For a finite-state system, Algorithm 1 obviously terminates, as Passed eventually contains the entire reachable state space, after which no further states can be put into Waiting and Waiting therefore eventually becomes ∅. It is equally straightforward to realise that Algorithm 1 produces correct results. Algorithm 1 is non-deterministic in selecting an element from Waiting and in generating the successors of the currently considered state. The latter can easily be determinised by generating states in a fixed order, while the former can be determinised in different ways: the two usual ones are to keep the elements of Waiting in a stack or in a queue and let the order induced by these define the search order.

Remark 8. As mentioned earlier, the explicit state space may in fact be infinite, thus Algorithm 1 may not terminate. In Lodin we have added options for terminating any verification after a user-defined time or after a user-defined amount of memory has been used.

LLVM Propositions

Lodin has support for propositions specifying classic programming errors (division by zero, data races, out-of-bounds errors, etc.). Furthermore, it is possible to compare registers and to check whether a specific function is called by a process. The use case for the latter is that the user can modify the verified program to call an error function and check if that function is called.

⟨Prop⟩ ::= ⟨Compare⟩ | ⟨Simple⟩
⟨Compare⟩ ::= ( ⟨Comparand⟩ ⟨OP⟩ ⟨Comparand⟩ )
⟨Comparand⟩ ::= ⟨Number⟩ | ⟨Register⟩
⟨Number⟩ ::= ⟨Integer⟩ ; ⟨Type⟩
⟨Register⟩ ::= @⟨Integer⟩.⟨String⟩.%⟨String⟩ ; ⟨Type⟩
⟨Type⟩ ::= ⟨us⟩8 | ⟨us⟩16 | ⟨us⟩32 | ⟨us⟩64
⟨us⟩ ::= ui | si
⟨OP⟩ ::= < | <= | >= | > | == | !=
⟨Simple⟩ ::= DataRace | DivZero | OverFlows | [ ⟨Integer⟩.⟨String⟩ ]

Figure 13: Grammar generating the verification queries of Lodin.
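The recursive interpretation of combined propositions can be sketched directly; the following toy evaluator (illustrative, not Lodin's query engine) represents atomic propositions as callables over a state and combinations as nested tuples.

```python
def interpret(prop, state):
    """Evaluate a (possibly combined) proposition over a state.

    Atomic propositions are callables; combinations are nested tuples
    ('and', p, q), ('or', p, q) and ('not', p).
    """
    if callable(prop):
        return prop(state)
    op = prop[0]
    if op == 'and':
        return interpret(prop[1], state) and interpret(prop[2], state)
    if op == 'or':
        return interpret(prop[1], state) or interpret(prop[2], state)
    if op == 'not':
        return not interpret(prop[1], state)
    raise ValueError(op)

# e.g. "x == 5 and no data race", over toy dictionary states
x_is_5 = lambda s: s['x'] == 5
race = lambda s: s['race']
prop = ('and', x_is_5, ('not', race))
assert interpret(prop, {'x': 5, 'race': False})
assert not interpret(prop, {'x': 5, 'race': True})
```

Reachability checking then asks whether any reachable state makes `interpret` return true.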
(This modification could even be done at compile time, by replacing the implementation of the commonly used assert function.) In Lodin's propositional language, registers and numbers are typed as signed or unsigned bitvectors with the suffixes ui_n and si_n, where n ∈ {8, 16, 32, 64}. For any production rule R in Figure 13 we write Ψ(R) for the language generated by that rule. An expression like @0.main.%tmp3;ui32 == 3;ui32 means: take register %tmp3 in the function @main of the 0th process, interpret it as an unsigned 32-bit integer, and compare it for equality with 3, also interpreted as an unsigned 32-bit integer. For comparisons to make sense, the two expressions being compared must naturally have the same type.

For evaluating the value of a register in a state (n = (s_1, ..., s_n, s_E, M)), we define

A_{@k.F.%tmp;ui_n}(n) = ⟨Eval^{i_n}_E(s_E, r)⟩ if s_k = ((@F, prev, cur, pc, π, Free), SL), tmp : i_n and r = π(tmp), and ⟨~0_n⟩ otherwise

A_{@k.F.%tmp;si_n}(n) = ⟨·Eval^{i_n}_E(s_E, r)·⟩ if s_k = ((@F, prev, cur, pc, π, Free), SL), tmp : i_n and r = π(tmp), and ⟨·~0_n·⟩ otherwise

For a number (e.g. 3;ui32) we write A_{3;ui32}(n), with the obvious implementation. Given these notations, we define how propositions are evaluated within Lodin in Figure 14. A short discussion may be in order about the evaluations in Figure 14.

• Division by zero (DivZero) is determined in the obvious manner: we simply check whether any process executes an instruction involving a division and whether the second operand is zero.
• Buffer overflows (OverFlows) are likewise easily checked: for each process that accesses memory, we check whether its read/write exceeds the length of the buffer it is reading from or writing into.
• For checking whether a specific process number i can call a function func ([i.
func]), we first check whether process i performs a call instruction and, if so, whether the function being called matches func.
• The most difficult proposition to check is without a doubt DataRace. For evaluating it, we iterate over all processes and find pairs of read/write or write/write accesses to the same pointer base. Afterwards we check whether their offset + length ranges overlap.

Example 2. As a short example of using Lodin for reachability checking, let us consider LLVM-Listing 1 and ask whether %x and %z can ever be equal. Notice that since all phi instructions should be executed atomically at the beginning of a block, this should never be possible, and checking it with Lodin actually

P_{A_1 ⊲⊳ A_2}(n) = A_{A_1}(n) ⊲⊳ A_{A_2}(n)

P_DivZero(n) = ⊤ if ∃ s_i = ((F, prev, cur, pc, π, Free), SL) with F = (@F, R, P, BL, BBs, Bm, ret), Bm(cur)[pc] = %res = DIV ty %inp1, %inp2 for DIV ∈ {udiv, sdiv, urem, srem}, r = π(%inp2) and ⟨Eval^ty_E(s_E, r)⟩ = 0, and ⊥ otherwise

P_OverFlows(n) = ⊤ if ∃ s_i = ((F, prev, cur, pc, π, Free), SL) with F = (@F, R, P, BL, BBs, Bm, ret), Bm(cur)[pc] = store ty %inp1, ty* %inp2, r = π(%inp2), π(%inp1) ∈ B^l, (len, v) = M(block(r)) and offset(r) + l > len; ⊤ if the same premises hold except that M(block(r)) = ⊥; and ⊥ otherwise

P_{[i.func]}(n) = ⊤ if s_i = ((F, prev, cur, pc, π, Free), SL) with F = (@F, R, P, BL, BBs, Bm, ret) and Bm(cur)[pc] = %res = call ret @func(ty_1 %p1 ...
ty_n %pn), and ⊥ otherwise

P_DataRace(n) = ⊤ if there exist s_i = ((F_i, prev_i, cur_i, pc_i, π_i, Free_i), SL_i) with F_i = (@F_i, R_i, P_i, BL_i, BBs_i, Bm_i, ret_i), Bm_i(cur_i)[pc_i] = %res = load ty_i, ty_i* ptr_i and p_i = Eval^{ty_i*}_E(s_E, π_i(ptr_i)), and s_j = ((F_j, prev_j, cur_j, pc_j, π_j, Free_j), SL_j) with F_j = (@F_j, R_j, P_j, BL_j, BBs_j, Bm_j, ret_j), Bm_j(cur_j)[pc_j] = store ty_j val_j, ty_j* ptr_j and p_j = Eval^{ty_j*}_E(s_E, π_j(ptr_j)), such that block(p_i) = block(p_j) and {offset(p_i), ..., offset(p_i) + BSize(ty_i)} ∩ {offset(p_j), ..., offset(p_j) + BSize(ty_j)} ≠ ∅, and ⊥ otherwise

Figure 14: Evaluation of propositions in Lodin, where A_1, A_2 ∈ Ψ(Register) ∪ Ψ(Number), ⊲⊳ ∈ Ψ(OP), n = (s_1, ..., s_n, s_E, M) and s_E = (M, N, F). For OverFlows we have only shown the rule for overflows at writes, but naturally there is an equivalent rule for reads.

Lodin example.ll example2.q
Lodin 0.3 (Jul 8 2019)
Revision : 0.2-802-ga42644cf
Importance Ratio: double
LLVM: 8.0.0
LLVM module modifications:
  Remove Unused instructions
Warning : No entry-point specified. Assuming main.
Random seed: 1562587068
System : NaiveGraph-explicit
Platform : PThread
Storage : SharedMem Storage
Successor: Standard
Prob-Successor: Standard
Passed-Waiting : Standard
SMT-Backend : Boolector 3.0.0
Verifying: E<>((0.main.b == 1))
Warning : Casting register main.b to integer type UI8 - can't guarantee LLVM uses this register as such
Not Satisfied

Lodin-Output 1: Output from Lodin.

checks whether Lodin implements the phi instruction behaviour correctly. In Lodin we can check the property by asking the query E<> (@0.main.%x;ui32 == @0.main.%z;ui32). Unfortunately, Lodin reports that this is indeed possible, even though it should not be. There is a logical explanation: both registers are initialised by Lodin to 0, thus in the initial state they are equal.
For this reason, it is more reasonable to use the %b register for our check; thus we check the query E<> (@0.main.%b;ui8 == 1;ui8) and get the result in Lodin-Output 1, indicating it is indeed not possible.

A well-known problem for explicit-state reachability checking of parallel systems is the notorious state-space explosion problem, i.e. that the combined state space grows exponentially while each process of the system grows linearly. This is a huge problem when considering high-level programs, and it is exacerbated when using LLVM as input, because LLVM programs have more instructions per process. To make explicit-state reachability checking feasible, we thus need ways of limiting the size of the state space. A first realisation towards reducing the state space is that processes can only influence each other's behaviour at predefined points, namely when accessing memory. Because our specification language allows querying whether functions can be called, we also consider call instructions to affect the external behaviour of a process. We say that an instruction Inst is internal if it is not a load, store or call instruction. We denote the set of all internal instructions by Internal(R). In the following we describe the two state-space reductions implemented inside Lodin. They both define a new transition relation that can directly replace −→_E.

e: Our first state-space reduction is based on the idea that when a process performs a transition step, it will also perform all following transitions that execute internal instructions. More formally, we replace the transition relation −→_E with −→_e, where −→_e is defined according to the rule

[n_{k−1} --Inst_k-->^i_E n_k]_{k=1...n}  implies  n_0 --Inst_1...Inst_n-->^i_e n_n,  provided ∀k > 1, Inst_k ∈ Internal(R).

Notice that there is no lower bound on the length of the sequence Inst_1, ..., Inst_n. To achieve the largest reduction, Lodin always uses the longest possible sequence.
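The e-reduction can be sketched as a collapsed successor function: take one arbitrary step, then greedily follow internal steps as long as any exist. This toy version (not Lodin's implementation) assumes, for simplicity, that each state has at most one internal continuation.

```python
def collapsed_successors(state, successors):
    """-->_e: one step, then the longest run of internal steps.

    `successors(s)` yields pairs (next_state, is_internal).
    """
    out = set()
    for nxt, _ in successors(state):       # first step may be anything
        while True:
            follow = [m for m, internal in successors(nxt) if internal]
            if not follow:
                break
            nxt = follow[0]                # assume a deterministic internal chain
        out.add(nxt)
    return out

# toy chain 0 -> 1 -> 2 -> 3 -> 4 -> 5 where only the step into 3 is visible
succ = lambda s: [(s + 1, s + 1 != 3)] if s < 5 else []
assert collapsed_successors(0, succ) == {2}   # stop before the visible step
assert collapsed_successors(2, succ) == {5}   # visible step, then internal run
```

Only the states at the ends of internal runs are stored, which is where the reduction in Table 4 comes from.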
I: In this state-space reduction, all processes that perform internal instructions execute simultaneously, while all other processes execute independently. The transition relation −→_I is defined by two rules:

[n_{k−1} --Inst_k-->^{i_k}_E n_k]_{k=1...n}  implies  n_0 --Inst_1...Inst_n-->^{i_1,...,i_n}_I n_n,  provided ∀k, Inst_k ∈ Internal(R)

n --Inst-->^i_E n'  implies  n --Inst-->^i_I n',  provided Inst ∉ Internal(R)

Figure 15: Lodin state-space reductions ((a) E, (b) e, (c) I). Transitions going left originate from one process while transitions going to the right correspond to another. Dashed arrows indicate visible actions.

In Figure 15 we provide a graphical overview of how these reductions modify the state space.

Example 3. As an example of the reductions that I and e respectively achieve, consider the C program in Figure 16, which executes Peterson's mutual exclusion algorithm. To use this program with Lodin, it must first be compiled to an .ll file using clang:

clang -S -c -emit-llvm file.c

After this step we can inspect the state-space reductions achieved by asking Lodin the query EnumStates on the resulting .ll file under the different state-space reductions.

State Generator | States | DataRace States
E | |
e | |
I | |

Table 4: Number of states reported by EnumStates under the different state-space reductions.

In Table 4 we see the reported number of states, along with how many states with data races were encountered. Notice that I in this case achieves the largest reduction.

Although the above state-space reductions can dramatically reduce the state space due to interleavings, they cannot reduce the number of states caused by non-deterministic input. A program with just one non-deterministic 32-bit value will end up having over 2^32 states.

In the preceding section we saw how Lodin can be used to perform an exhaustive state-space search under an explicit context. We also realised that the state-space explosion problem poses a problem for any exhaustive search, and showed how Lodin can reduce this explosion through state-space reductions.
The state space reductions also have their limits, so we need other strategies for handling this explosion. Lodin proposes to use a simulation-based technique, where random (step-bounded) traces are drawn from the program and inspected for satisfaction of the property at hand. At the heart of any simulation-based technique is an underlying simulation distribution. The simulation distribution may stem from actual knowledge of how the system behaves, in which case simulations can be used to calculate actual probabilities of the system satisfying the property using statistical methods; hence the name statistical model checking [21]. In case the simulation distribution is "arbitrary", the estimated probabilities are meaningless for the system itself, but serve as a way to predict how likely it is that a continued search will find the property searched for. In this case the technique is called Monte Carlo model checking.

    int flags[2] = {0, 0};
    int turn = 0;

    void crit() {}

    typedef struct {
        int *mflag;
        int *oflag;
        int *turn;
    } Options;

    void* petersons1() {
        Options opt;
        opt.mflag = &flags[0];
        opt.oflag = &flags[1];
        opt.turn  = &turn;
        *(opt.mflag) = 1;
        *(opt.turn)  = 1;
        while (*(opt.oflag) && *(opt.turn) == 1) {
            // busy wait
        }
        // critical section
        crit();
        // end of critical section
        *(opt.mflag) = 0;
        return 0;
    }

    void* petersons2() {
        Options opt;
        opt.mflag = &flags[1];
        opt.oflag = &flags[0];
        opt.turn  = &turn;
        *(opt.mflag) = 1;
        *(opt.turn)  = 0;
        while (*(opt.oflag) && *(opt.turn) == 0) {
            // busy wait
        }
        // critical section
        crit();
        // end of critical section
        *(opt.mflag) = 0;
        return 0;
    }

Figure 16: Peterson's mutual exclusion protocol.

In Lodin each state n of the state space L_E^M = (N, n_0, −→E) is assigned a probability distribution γ_n : ℕ → [0, 1] over process identifiers. γ_n should obviously only assign probability mass to a process if that process can perform a transition; thus we require that γ_n(i) ≠ 0 implies n −Inst→_i^E n′ for some n′.
Having selected the process i that should perform an action, we also need a probability function for the result of that choice. We do this by assuming a δ_{n,i} : N → [0, 1], where N is the set of all states. The requirement on this function is that it should only assign probability to states that can be reached by the i-th process performing a transition from n, i.e. δ_{n,i}(n′) ≠ 0 implies n −Inst→_i^E n′ for some instruction Inst.

Given these two probability mass functions, the probability that the system generates the finite transition sequence ω = n_0 −Inst_1→_{i_1}^E n_1 −Inst_2→_{i_2}^E ... −Inst_n→_{i_n}^E n_n, where n_0 is the initial state, is given by

    P(ω) = Π_{k=1}^{n} γ_{n_{k−1}}(i_k) · δ_{n_{k−1},i_k}(n_k).

For such a transition sequence ω we let |ω| = n be its length and ω[i] = n_i. We also let Ω_{m,M} be the set of all transition sequences ω with |ω| = m of LLVM module M. Let p be a proposition and ω ∈ Ω_{m,M}; then we define the indicator function I_p(ω) to be 1 if there exists an i such that P_p(ω[i]) = tt, i.e. ω at some point satisfies p, and 0 otherwise. With this at hand, we define the probability that an execution trace of a program M satisfies a proposition p within m steps as

    Pr_{M,m}(p) = Σ_{ω ∈ Ω_{m,M}} I_p(ω) · P(ω).

As the probability only depends on the states, we usually project out the transitions and only generate the states. An algorithm for generating a sequence of states from n_0 according to the probability distribution can be seen in Algorithm 2. In the algorithm we use k ∼ P to mean that k is distributed according to the probability mass function P.

    Data: Initial state: n_0
    Data: Length: n
    ω = n_0;
    for i ∈ {1, ..., n} do
        k ∼ γ_{n_{i−1}};
        n_i ∼ δ_{n_{i−1},k};
        ω = ω n_i;
    end
    return ω
Algorithm 2: Generating random traces in Lodin.

    n       States   Data-race states
    1       77       1
    100     1840     4
    1000    3579     11
    10000   4714     14
Table 5: States encountered with SMC. The used query is EnumStatesSMC <=5000 n.

Example 4.
Before dwelling on how to use simulation for verification, let us briefly consider what kind of coverage of the state space we can expect from simulations. To this end, we have implemented the query EnumStatesSMC <=l n. This query simply generates n traces, each of length l, and keeps track of how many different states it has visited in total. We show the results of running this query on the program of Figure 16 in Table 5; recall the total number of states reported by the exhaustive search previously.

Statistical model checking tries to answer two questions: 1. a quantitative one, "What is the probability θ of reaching p?", and 2. a qualitative one, "Is the probability θ greater than θ_t?". Both questions are answered by generating a number of samples and using statistical techniques to infer the answer with a user-specified confidence.

Quantitative. Here we repeatedly generate runs and construct an interval [θ_l, θ_u] in which we are confident that the probability θ is contained. For the following we assume we are provided with ε, the wanted width of the interval, and an α ∈ [0, 1] indicating the confidence (1 − α) we want in the interval.

Consider that we have generated a sequence of samples ω_1, ω_2, ..., and let x_1, ..., x_m be random variables such that x_i = I_p(ω_i). Then each variable x_i has a Bernoulli distribution with success probability θ, and the sum X_m = Σ_{i=1}^{m} x_i is binomially distributed. We construct a confidence interval using the exact confidence interval by Clopper and Pearson [10]: if we have m samples, then a Clopper-Pearson interval with confidence 1 − α is given as the intersection S≤ ∩ S≥, where

    S≤ = { ψ | B_{m,ψ}(X_m) > α/2 }
    S≥ = { ψ | 1 − B_{m,ψ}(X_m) > α/2 }

and B_{m,ψ} is the cumulative distribution function for a binomial distribution with m samples and success parameter ψ.
Notice that we are not in control of the resulting width of this interval; more samples will, however, shrink the width, and thus we simply produce samples iteratively until we get the desired width ε.

Example 5. Let us consider the program in Figure 16 again and assess the probability that a data race is encountered. We can assess this with the query Pr[<=5000] (<> DataRace), where 5000 is the length of the runs. See Lodin-Output 2 for the output.

    Lodin 0.3 (Jul 8 2019)
    Revision: 0.2-802-ga42644cf
    Importance Ratio: double
    LLVM: 8.0.0
    LLVM module modifications:
      Remove Unused instructions
    Warning: Function signature of entry point petersons1 (Pointer ())
             does not match on return type by platform (UI32 ())
    Warning: Function signature of entry point petersons2 (Pointer ())
             does not match on return type by platform (UI32 ())
    Random seed: 1562589004
    System: NaiveGraph-explicit
    Platform: PThread
    Storage: SharedMem Storage
    Successor: Standard
    Prob-Successor: Standard
    Passed-Waiting: Standard
    SMT-Backend: Boolector 3.0.0
    Verifying: Pr[<=5000](<>DataRace)
    Result: [0.285738, 0.295738] with confidence 0.95
    Total Runs: 31883, Satisfying Runs: 9269
    Histogram: Satisfying Runs
    Max Frequency: 0.504262
    Values in [28, 103] in steps of 1
    [4674, 2057, 0, 0, 0, 0, 0, 0, 0, 480, 0, 0, 0, 0, 0, 0, 0, 615,
     0, 0, 0, 0, 84, 0, 0, 0, 0, 0, 0, 0, 20, 0, 0, 0, 0, 1167, 0, 0,
     0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 152, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
     0, 0, 18, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2]
Lodin-Output 2: Lodin output for the query Pr[<=5000] (<> DataRace).

From the output we can see that Lodin estimates the probability to lie in the interval [0.285738, 0.295738]. The last part provides a histogram over the lengths of the satisfying runs. Lodin runs by default with α = 0.05 and ε = 0.01. These parameters can be tweaked by suffixing the query with { Alpha = Float, Epsilon = Float }, where Float are numbers in [0, 1]. Running the query Pr[<=5000] (<> DataRace) { Alpha = 0.01, Epsilon = 0.05 } for instance gives an interval of width 0.05 with confidence 0.99.

Qualitative.
Checking whether the probability Pr_{M,m}(p) exceeds a threshold θ can be answered by hypothesis testing. We test the hypothesis H_0 : Pr_{M,m}(p) ≥ θ against H_1 : Pr_{M,m}(p) < θ. In advance, we define two parameters, α (significance level) and β (power level), that signify how willing we are to reject a true hypothesis and how willing we are to accept a false hypothesis. In practice we want a test for which the probability of rejecting H_0 while H_0 is true is less than α, while the probability of accepting H_0 while H_1 is true is less than β. Realising that achieving both of these requirements is close to impossible in general [22], we introduce an indifference region of width 2 · δ around θ and test instead the hypothesis H′_0 : Pr_M(p) ≥ θ + δ against H′_1 : Pr_M(p) < θ − δ. Wald [20] developed a sequential hypothesis testing algorithm, see Algorithm 3, for exactly this case; the idea is to iteratively generate runs and, based on these, calculate a value r. Eventually this value will cross log(β/(1 − α)) or log((1 − β)/α), and H′_0 is either accepted or rejected.

In previous sections we described the symbolic representation of states used within Lodin, and we saw in an example how this representation could be used to explore many register values simultaneously.
We, however, did not give a structured way of using this symbolic representation in a verification framework. We make up for that in this section.

    Data: Initial state: s
    Data: Property: Pr_{M,m}(p) ≥ θ
    Data: Indifference region: 2 · δ
    Data: Significance level: α
    Data: Power level: β
    Result: ⊤ or ⊥
    p_0 = θ + δ;
    p_1 = θ − δ;
    r = 0;
    while true do
        ω = generateRun(s, m);
        x = I_p(ω);
        r = r + x · log(p_1/p_0) + (1 − x) · log((1 − p_1)/(1 − p_0));
        if r ≤ log(β/(1 − α)) then return ⊤ end
        if r ≥ log((1 − β)/α) then return ⊥ end
    end
Algorithm 3: Testing whether the probability is larger than θ.

In this section we show how Lodin uses its symbolic representation to analyse single-threaded programs without loops. For now, we also restrict our attention to verifying whether a given function can be called at any time, e.g. propositions such as [0.@error]. Before going into details about the algorithm, we set up some convenient notation to make the algorithm more readable.

A key concept we will need in the algorithm for analysing loop-free programs is converging and diverging basic blocks: for an LLVM function (@N, R, P, BL, BBs, Bm, ret), we say that a block B ∈ BBs diverges control flow if B[|B|] def= (br i1 c, label %trueb, label %falseb). For a block B ∈ BBs where Bm(con) = B for some con, we define the set In(con) of all blocks that jump to B below.

    Data: Property: φ
    Data: Initial state: n_0
    Result: ⊤ or ⊥
    Mergees := ∅;
    Waiting := { n_0 };
    while Waiting ≠ ∅ do
        Let n_c ∈ Waiting;
        Waiting := Waiting \ { n_c };
        if P_{[i.@func]}(n_c) then
            return ⊤
        end
        foreach n_n ∈ { n | ∃ i, Inst s.t.
 n_c −Inst→_i^S n } do
            if ¬Mergeable(n_n) then
                Waiting := Waiting ∪ { n_n };
            else
                Let n_n = ((F, prev, cur, pc, π, Free) : S, s_S);
                if ∃ (cur, n_o, c) ∈ Mergees then
                    if c = 1 then
                        Mergees := Mergees \ { (cur, n_o, c) };
                        Waiting := Waiting ∪ { merge(n_o, n_n) };
                    else
                        Mergees := Mergees \ { (cur, n_o, c) }
                                   ∪ { (cur, merge(n_o, n_n), c − 1) };
                    end
                else
                    Mergees := Mergees ∪ { (cur, n_n, |In(n_n)| − 1) };
                end
            end
        end
    end
    return ⊥
Algorithm 4: The symbolic reachability algorithm.

The set of blocks jumping to a block labelled con is

    In(con) = { B′ ∈ BBs | B′[|B′|] def= (br i1 r, label %con, label %f) }
            ∪ { B′ ∈ BBs | B′[|B′|] def= (br i1 r, label %t, label %con) }
            ∪ { B′ ∈ BBs | B′[|B′|] def= (br label %con) },

and we say that con labels a converging block if |In(con)| > 1. For ease of writing we will say that con is a converging block. The definition of In we lift to states of L_S^M = (N, n_S, −→S) as follows: if n = (s : S, s_S) and s = (F, prev, cur, pc, π, Free), then In(n) = In(cur).

In the discussion of the symbolic context, we defined how to merge symbolic context states. Here we wish to lift merging to states n, n′ ∈ N. A state n = ((F, prev, cur, pc, π, Free) : S, s_S) is considered mergeable (written Mergeable(n)) if pc is not a phi i32 instruction and |In(n)| > 1. It can be merged with another state n′ = ((F, prev′, cur, pc, π, Free′) : S, s′_S) if s_S and s′_S can be merged. The merge of n, n′ is defined as

    merge(n, n′) = ((F, prev, cur, pc, π, Free ∪ Free′) : S, merge(s_S, s′_S)).

After these preliminary setups, we are ready to show the algorithm in Algorithm 4. To a large extent it is the classic reachability algorithm, where unexplored states are kept in a Waiting list, and a state pulled from Waiting is immediately checked against the property at hand. Checking whether the property [i.@func] is true involves 1.
checking whether the function @func is being called by the i-th process (a check that does not depend on the LLVM registers), and 2. checking whether the path formula of the state is satisfiable. If the property is not satisfied, then all possible successors are generated and either put into Waiting (if not a mergeable state) or merged, when possible, with a state already in the Mergees queue.

Handling Loops. Any nontrivial program will have loops, and as such verification techniques must cope with them. Lodin can verify programs with loops, but relies on syntactically unrolling the loops before verification. In case the loop unroll is complete, the verification is complete; otherwise the verification is only sound.

Lodin is built around LLVM bitcode and uses the LLVM libraries for parsing the input files and performing some LLVM modifications at load time. Lodin does, however, not use the infrastructure of LLVM for performing analyses. Instead it builds its own internal representation of the loaded LLVM module and implements its own state space successor generator.

At load time Lodin can perform a number of modifications of the LLVM program; some of the modifications are enabled by default, and some are force-enabled by others. In the following we briefly discuss these modifications.

Naming Instructions. LLVM bitcode files do not necessarily contain names for the registers. At load time Lodin therefore gives names to all unnamed registers in the program. This simplifies providing error messages internally.

Constant Removal. LLVM bitcode instructions can contain constant expressions which the interpreter of Lodin would have to evaluate at run time. We replace these constant expressions with LLVM instructions, thus simplifying the subset of LLVM that our interpreter needs to understand.
Simplify CFG. This is a standard LLVM modification that attempts to simplify the control flow graph. Lodin provides an option for running this simplification, but does not run it by default, as it modifies the program drastically and the specifications of the user may therefore no longer be "valid". The modification can be enabled by the user or forced by other modifications. To help the user, the modified program can also be output at load time.

Eliminate Dead Code. As the name suggests, this modification removes code that can statically be determined to be unreachable. This is a standard LLVM modification that has to be enabled by the user.

Constant Propagation. This is a standard LLVM modification that forwards constants in the LLVM code and thereby reduces the number of instructions in the LLVM code.

Mem2Reg. This modification tries to promote memory operations to register operations. This is useful as it makes operations easier for some of the other modifications. The modification can be enabled by the user or forced by other modifications.

Loop Unrolling. This is the only modification that requires a user-specified input n. The modification unrolls all detected loops in the program at most n times. If it can be determined that a loop will only execute m < n times, it is of course only unrolled m times. The unrolling is implemented inside Lodin but borrows the unrolling strategy from the LLVM library. The reason the loop unrolling does not use the default LLVM unrolling method is that Lodin needs more control of the unrolling than the interface offers. Enabling loop unrolling force-enables Mem2Reg and Simplify CFG. The main usage of loop unrolling is to support the unrolling needed by bounded model checking.
Lodin employs a layered architecture (see Figure 17) where high-level algorithms, as detailed in the previous sections, can be implemented without knowledge of low-level considerations such as how states are represented. The algorithms depend on state generators implementing the state space reductions or the probabilistic semantics. The generators in turn depend on a joint interpreter-platform unit, which interacts with an interface to a state representation (how activation records are stored, etc.). The state representation then depends on a context-memory unit which performs the operations requested by the interpreter. At the lowest level of the architecture is the storage unit, which is responsible for storing and saving states (used by the implementation of the Passed/Waiting sets in Algorithm 4).

Figure 17: Architecture of Lodin: Algorithms on top, then Generators and Prob-Generators, the joint Interpreter and Platforms unit, the State Rep, the Context and Memory units, and Storage at the bottom.

SMT Solvers. Lodin uses external SMT solvers for solving the constraints gathered by the symbolic context implementation. The constraints are represented in a solver-independent format and only converted to SMT-solver specifics at the last minute. This allows easily interchanging the used solver: currently Lodin is linked against Z3 [11] and Boolector [19], and uses Boolector by default.

Conclusion

We presented the fairly new tool Lodin. Lodin implements explicit-state model checking of LLVM programs with concurrent processes. To combat the state space explosion problem, Lodin supplements explicit-state model checking techniques with simulation-based techniques. For single-threaded programs Lodin implements a symbolic state space representation, allowing it to verify programs with non-deterministic input precisely. The symbolic engine of Lodin uses off-the-shelf SMT solvers, presently Boolector and Z3.

References

[1] Christel Baier and Joost-Pieter Katoen. Principles of Model Checking. MIT Press, 2008. ISBN 978-0-262-02649-9.

[2] Thomas Ball and Sriram K. Rajamani.
The SLAM toolkit. In Gérard Berry, Hubert Comon, and Alain Finkel, editors, Computer Aided Verification, 13th International Conference, CAV 2001, Paris, France, July 18-22, 2001, Proceedings, volume 2102 of Lecture Notes in Computer Science, pages 260-264. Springer, 2001. doi: 10.1007/3-540-44585-4_25.

[3] Zuzana Baranová, Jiří Barnat, Katarína Kejstová, Tadeáš Kučera, Henrich Lauko, Jan Mrázek, Petr Ročkai, and Vladimír Štill. Model checking of C and C++ with DIVINE 4. In Automated Technology for Verification and Analysis (ATVA 2017), volume 10482 of LNCS, pages 201-207. Springer, 2017.

[4] Dirk Beyer and M. Erkan Keremoglu. CPAchecker: A tool for configurable software verification. In Ganesh Gopalakrishnan and Shaz Qadeer, editors, Computer Aided Verification - 23rd International Conference, CAV 2011, Snowbird, UT, USA, July 14-20, 2011. Proceedings, volume 6806 of Lecture Notes in Computer Science, pages 184-190. Springer, 2011. doi: 10.1007/978-3-642-22110-1_16.

[5] Dirk Beyer, Thomas A. Henzinger, Ranjit Jhala, and Rupak Majumdar. The software model checker Blast. STTT, 9(5-6):505-525, 2007. doi: 10.1007/s10009-007-0044-z.

[6] Armin Biere, Alessandro Cimatti, Edmund M. Clarke, Ofer Strichman, and Yunshan Zhu. Bounded model checking. Advances in Computers, 58:117-148, 2003. doi: 10.1016/S0065-2458(03)58003-2.

[7] Cristian Cadar, Daniel Dunbar, and Dawson R. Engler. KLEE: unassisted and automatic generation of high-coverage tests for complex systems programs. In Richard Draves and Robbert van Renesse, editors, pages 209-224. USENIX Association, 2008.

[8] Edmund Clarke, Orna Grumberg, and Doron Peled. Model Checking. MIT Press, 1999.

[9] Edmund M. Clarke, Orna Grumberg, Somesh Jha, Yuan Lu, and Helmut Veith.
Counterexample-guided abstraction refinement for symbolic model checking. J. ACM, 50(5):752-794, 2003. doi: 10.1145/876638.876643.

[10] Charles J. Clopper and Egon S. Pearson. The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika, 26(4):404-413, 1934.

[11] Leonardo Mendonça de Moura and Nikolaj Bjørner. Z3: an efficient SMT solver. In C. R. Ramakrishnan and Jakob Rehof, editors, Tools and Algorithms for the Construction and Analysis of Systems, 14th International Conference, TACAS 2008, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2008, Budapest, Hungary, March 29-April 6, 2008. Proceedings, volume 4963 of Lecture Notes in Computer Science, pages 337-340. Springer, 2008. doi: 10.1007/978-3-540-78800-3_24.

[12] LLVM Developers. LLVM language reference manual. https://llvm.org/docs/LangRef.html, 2018.

[13] Stephan Falke, Florian Merz, and Carsten Sinz. LLBMC: improved bounded model checking of C programs using LLVM (competition contribution). In Nir Piterman and Scott A. Smolka, editors, Tools and Algorithms for the Construction and Analysis of Systems - 19th International Conference, TACAS 2013, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2013, Rome, Italy, March 16-24, 2013. Proceedings, volume 7795 of Lecture Notes in Computer Science, pages 623-626. Springer, 2013. doi: 10.1007/978-3-642-36742-7_48.

[14] Patrice Godefroid. VeriSoft: A tool for the automatic analysis of concurrent reactive software. In Orna Grumberg, editor, Computer Aided Verification, 9th International Conference, CAV '97, Haifa, Israel, June 22-25, 1997, Proceedings, volume 1254 of Lecture Notes in Computer Science, pages 476-479. Springer, 1997. doi: 10.1007/3-540-63166-6_52.
[15] Arie Gurfinkel, Temesghen Kahsai, Anvesh Komuravelli, and Jorge A. Navas. The SeaHorn verification framework. In Daniel Kroening and Corina S. Pasareanu, editors, Computer Aided Verification - 27th International Conference, CAV 2015, San Francisco, CA, USA, July 18-24, 2015, Proceedings, Part I, volume 9206 of Lecture Notes in Computer Science, pages 343-361. Springer, 2015. doi: 10.1007/978-3-319-21690-4_20.

[16] Daniel Kroening and Michael Tautschnig. CBMC - C bounded model checker (competition contribution). In Erika Ábrahám and Klaus Havelund, editors, Tools and Algorithms for the Construction and Analysis of Systems - 20th International Conference, TACAS 2014, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2014, Grenoble, France, April 5-13, 2014. Proceedings, volume 8413 of Lecture Notes in Computer Science, pages 389-391. Springer, 2014. doi: 10.1007/978-3-642-54862-8_26.

[17] Chris Lattner and Vikram S. Adve. LLVM: A compilation framework for lifelong program analysis & transformation. In , pages 75-88. IEEE Computer Society, 2004. doi: 10.1109/CGO.2004.1281665.

[18] Axel Legay, Dirk Nowotka, Danny Bøgsted Poulsen, and Louis-Marie Traonouez. Statistical model checking of LLVM code. In Klaus Havelund, Jan Peleska, Bill Roscoe, and Erik P. de Vink, editors, Formal Methods - 22nd International Symposium, FM 2018, Held as Part of the Federated Logic Conference, FloC 2018, Oxford, UK, July 15-17, 2018, Proceedings, volume 10951 of Lecture Notes in Computer Science, pages 542-549. Springer, 2018. doi: 10.1007/978-3-319-95582-7_32.
[19] Aina Niemetz, Mathias Preiner, Clifford Wolf, and Armin Biere. Btor2, BtorMC and Boolector 3.0. In Hana Chockler and Georg Weissenbacher, editors, Computer Aided Verification - 30th International Conference, CAV 2018, Held as Part of the Federated Logic Conference, FloC 2018, Oxford, UK, July 14-17, 2018, Proceedings, Part I, volume 10981 of Lecture Notes in Computer Science, pages 587-595. Springer, 2018. doi: 10.1007/978-3-319-96145-3_32.

[20] Abraham Wald. Sequential Analysis. Courier Corporation, 1973.

[21] Håkan L. S. Younes, Marta Z. Kwiatkowska, Gethin Norman, and David Parker. Numerical vs. statistical probabilistic model checking. STTT, 8(3):216-228, 2006.

[22] Håkan L. S. Younes.