RustHorn: CHC-based Verification for Rust Programs (full version)
Yusuke Matsushita, Takeshi Tsukada, and Naoki Kobayashi
The University of Tokyo, Tokyo, Japan
{yskm24t,tsukada,koba}@is.s.u-tokyo.ac.jp

Abstract.
Reduction to the satisfiability problem for constrained Horn clauses (CHCs) is a widely studied approach to automated program verification. The current CHC-based methods for pointer-manipulating programs, however, are not very scalable. This paper proposes a novel translation of pointer-manipulating Rust programs into CHCs, which clears away pointers and memories by leveraging ownership. We formalize the translation for a simplified core of Rust and prove its correctness. We have implemented a prototype verifier for a subset of Rust and confirmed the effectiveness of our method.
Reduction to constrained Horn clauses (CHCs) is a widely studied approach to automated program verification [22,6]. A CHC is a Horn clause [30] equipped with constraints, namely a formula of the form $\varphi \Longleftarrow \psi_0 \land \cdots \land \psi_{k-1}$, where $\varphi$ and $\psi_0, \dots, \psi_{k-1}$ are either an atomic formula of the form $f(t_0, \dots, t_{n-1})$ ($f$ is a predicate variable and $t_0, \dots, t_{n-1}$ are terms), or a constraint (e.g. $a < b + 1$). We call a finite set of CHCs a CHC system or sometimes just CHC.
CHC solving is an act of deciding whether a given CHC system $S$ has a model, i.e. a valuation for predicate variables that makes all the CHCs in $S$ valid. A variety of program verification problems can be naturally reduced to CHC solving.

For example, let us consider the following C code that defines McCarthy's 91 function.

    int mc91(int n) {
      if (n > 100) return n - 10; else return mc91(mc91(n + 11));
    }

Suppose that we wish to prove that mc91(n) returns 91 whenever $n \le 101$ (if it terminates). The wished property is equivalent to the satisfiability of the following CHCs, where $\mathit{Mc91}(n, r)$ means that mc91(n) returns r if it terminates.

$$
\begin{aligned}
\mathit{Mc91}(n, r) &\Longleftarrow n > 100 \land r = n - 10 \\
\mathit{Mc91}(n, r) &\Longleftarrow n \le 100 \land \mathit{Mc91}(n + 11, r') \land \mathit{Mc91}(r', r) \\
r = 91 &\Longleftarrow n \le 101 \land \mathit{Mc91}(n, r)
\end{aligned}
$$

Footnote: This paper is the full version of [47].
Footnote: Free variables are universally quantified. Terms and variables are governed under sorts (e.g. int, bool), which are made explicit in the formalization of §.

The property can be verified because this CHC system has a model:

$$\mathit{Mc91}(n, r) :\iff r = 91 \lor (n > 100 \land r = n - 10).$$

A CHC solver provides a common infrastructure for a variety of programming languages and properties to be verified. There have been effective CHC solvers [40,18,29,12] that can solve instances obtained from actual programs, and many program verification tools [23,37,25,28,38,60] use a CHC solver as a backend.

However, the current CHC-based methods do not scale very well for programs using pointers, as we see in §. … Rust-style ownership, as we explain in §.

The standard CHC-based approach [23] for pointer-manipulating programs represents the memory state as an array, which is passed around as an argument of each predicate (cf. the store-passing style), and a pointer as an index.

For example, a pointer-manipulating variation of the previous program

    void mc91p(int n, int* r) {
      if (n > 100) *r = n - 10;
      else { int s; mc91p(n + 11, &s); mc91p(s, r); }
    }

is translated into the following CHCs by the array-based approach:

$$
\begin{aligned}
\mathit{Mc91p}(n, r, h, h') &\Longleftarrow n > 100 \land h' = h\{r \leftarrow n - 10\} \\
\mathit{Mc91p}(n, r, h, h') &\Longleftarrow n \le 100 \land \mathit{Mc91p}(n + 11, ms, h, h'') \land \mathit{Mc91p}(h''[ms], r, h'', h') \\
h'[r] = 91 &\Longleftarrow n \le 101 \land \mathit{Mc91p}(n, r, h, h').
\end{aligned}
$$

$\mathit{Mc91p}$ additionally takes two arrays $h$, $h'$ representing the (heap) memory states before/after the call of mc91p. The second argument $r$ of $\mathit{Mc91p}$, which corresponds to the pointer argument r in the original program, is an index for the arrays. Hence, the assignment *r = n - 10 is modeled in the first CHC as an update of the $r$-th element of the array. $ms$ represents the address of s.
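The store-passing reading of mc91p can be made concrete with a small Rust sketch. It is only an illustration under stated assumptions: the heap is a growable `Vec<i64>`, a pointer is an index into it, the local `s` is allocated by pushing a fresh cell, and the helper `run_mc91p` is a hypothetical driver that is not part of the paper.

```rust
/// Store-passing sketch of `mc91p`: the heap is an explicit array and a
/// pointer is just an index into it (hypothetical encoding for illustration).
fn mc91p(n: i64, r: usize, heap: &mut Vec<i64>) {
    if n > 100 {
        // h' = h{r <- n - 10}
        heap[r] = n - 10;
    } else {
        // Allocate the local `s`; `ms` plays the role of its address.
        heap.push(0);
        let ms = heap.len() - 1;
        mc91p(n + 11, ms, heap); // h   -> h''
        let s = heap[ms]; //        h''[ms]
        mc91p(s, r, heap); //       h'' -> h'
    }
}

/// Runs `mc91p` on a one-cell heap and returns the final content of that cell.
fn run_mc91p(n: i64) -> i64 {
    let mut heap = vec![0];
    mc91p(n, 0, &mut heap);
    heap[0]
}
```

Calling `run_mc91p(n)` for some `n` at most 101 exercises exactly the property stated by the third CHC, namely that the cell at index `r` holds 91 afterwards. Such testing only samples inputs; it does not replace CHC solving.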
This CHC system has a model

$$\mathit{Mc91p}(n, r, h, h') :\iff h'[r] = 91 \lor (n > 100 \land h'[r] = n - 10),$$

which can be found by some array-supporting CHC solvers including Spacer [40], thanks to evolving SMT-solving techniques for arrays [62,10].

Footnote: For example, the above CHC system on $\mathit{Mc91}$ can be solved instantly by many CHC solvers including Spacer [40] and HoIce [12].
Footnote: $h\{r \leftarrow v\}$ is the array made from $h$ by replacing the value at index $r$ with $v$. $h[r]$ is the value of array $h$ at index $r$.

However, the array-based approach has some shortcomings. Let us consider, for example, the following innocent-looking code.

    bool just_rec(int* ma) {
      if (rand() >= 0) return true;
      int old_a = *ma; int b = rand(); just_rec(&b);
      return (old_a == *ma);
    }

It can immediately return true; or it recursively calls itself and checks if the target of ma remains unchanged through the recursive call. In effect this function does nothing on the allocated memory blocks, although it can possibly modify some of the unused parts of the memory.

Suppose we wish to verify that just_rec never returns false. The standard CHC-based verifier for C, SeaHorn [23], generates a CHC system like below:

$$
\begin{aligned}
\mathit{JustRec}(ma, h, h', r) &\Longleftarrow h' = h \land r = \mathit{true} \\
\mathit{JustRec}(ma, h, h', r) &\Longleftarrow mb \ne ma \land h'' = h\{mb \leftarrow b\} \land \mathit{JustRec}(mb, h'', h', \_) \land r = (h[ma] \mathbin{==} h'[ma]) \\
r = \mathit{true} &\Longleftarrow \mathit{JustRec}(ma, h, h', r)
\end{aligned}
$$

Unfortunately the CHC system above is not satisfiable and thus SeaHorn issues a false alarm. This is because, in this formulation, $mb$ may not necessarily be completely fresh; it is assumed to be different from the argument $ma$ of the current call, but may coincide with $ma$ of some deep ancestor calls.

The simplest remedy would be to explicitly specify the way of memory allocation. For example, one can represent the memory state as a pair of an array $h$ and an index $sp$ indicating the maximum index that has been allocated so far.
$$
\begin{aligned}
\mathit{JustRec}_+(ma, h, sp, h', sp', r) &\Longleftarrow h' = h \land sp' = sp \land r = \mathit{true} \\
\mathit{JustRec}_+(ma, h, sp, h', sp', r) &\Longleftarrow mb = sp'' = sp + 1 \land h'' = h\{mb \leftarrow b\} \\
&\qquad \land\ \mathit{JustRec}_+(mb, h'', sp'', h', sp', \_) \land r = (h[ma] \mathbin{==} h'[ma]) \\
r = \mathit{true} &\Longleftarrow \mathit{JustRec}_+(ma, h, sp, h', sp', r) \land ma \le sp
\end{aligned}
$$

The resulting CHC system now has a model, but it involves quantifiers:

$$\mathit{JustRec}_+(ma, h, sp, h', sp', r) :\iff r = \mathit{true} \land ma \le sp \land sp \le sp' \land \forall i \le sp.\ h[i] = h'[i]$$

Finding quantified invariants is known to be difficult in general despite active studies on it [41,2,36,26,19], and most current array-supporting CHC solvers give up finding quantified invariants. In general, much more complex operations on pointers can naturally take place, which makes the universally quantified invariants highly involved and hard to find automatically. To avoid complexity of models, CHC-based verification tools [23,24,37] tackle pointers by pointer analysis [61,43]. Although it does have some effects, the current applicable scope of pointer analysis is quite limited.

Footnote: rand() is a non-deterministic function that can return any integer value.
Footnote: ==, !=, >=, && denote binary operations that return boolean values.
Footnote: We omitted the allocation for old_a for simplicity.
Footnote: Precisely speaking, SeaHorn tends to even omit shallow address-freshness checks like $mb \ne ma$.

This paper proposes a novel approach to CHC-based verification of pointer-manipulating programs, which makes use of ownership information to avoid an explicit representation of the memory.
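The freshness issue behind the false alarm on just_rec can be reproduced with a small simulation. The following Rust sketch rests on several assumptions that are not in the paper: the heap is a `Vec<i64>`, the nondeterministic test `rand() >= 0` is replaced by an explicit `depth` counter, the value of `b` is an arbitrary number, and the two allocation policies (`Alloc::Fresh`, `Alloc::DistinctFromArgOnly`) are hypothetical names for bump allocation and for the weaker constraint $mb \ne ma$.

```rust
/// How the address of the local variable `b` is chosen (hypothetical policies).
#[derive(Clone, Copy)]
enum Alloc {
    /// Bump allocation: always a brand-new cell (the `sp + 1` remedy).
    Fresh,
    /// Only guaranteed to differ from the current argument `ma`.
    DistinctFromArgOnly,
}

/// `just_rec` over an explicit heap; `depth` replaces the nondeterministic
/// `rand() >= 0` test, and `b` gets an arbitrary value.
fn just_rec(ma: usize, heap: &mut Vec<i64>, depth: u32, alloc: Alloc) -> bool {
    if depth == 0 {
        return true;
    }
    let old_a = heap[ma];
    let mb = match alloc {
        Alloc::Fresh => {
            heap.push(0);
            heap.len() - 1
        }
        // Adversarial choice among cells 0 and 1: never `ma` itself,
        // but possibly the `ma` of an ancestor call.
        Alloc::DistinctFromArgOnly => if ma == 0 { 1 } else { 0 },
    };
    heap[mb] = 100 + i64::from(depth); // int b = rand();
    just_rec(mb, heap, depth - 1, alloc);
    old_a == heap[ma]
}
```

With a two-cell initial heap and `depth = 2`, the `DistinctFromArgOnly` policy lets the inner call overwrite the outer call's target, so the outer comparison fails; with `Fresh` the comparison always succeeds. This mirrors why the first CHC system is unsatisfiable while the one with $sp$ has a model.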
Rust-style Ownership.
Various styles of ownership/permission/capability have been introduced to control and reason about usage of pointers on programming language design, program analysis and verification [13,31,8,9,7,64,63]. In what follows, we focus on the ownership in the style of the Rust programming language [46,55].

Roughly speaking, the ownership system guarantees that, for each memory cell and at each point of program execution, either (i) only one alias has the update (write & read) permission to the cell, with any other alias having no permission to it, or (ii) some (or no) aliases have the read permission to the cell, with no alias having the update permission to it. In summary, when an alias can read some data (with an update/read permission), any other alias cannot modify the data.

As a running example, let us consider the program below, which follows Rust's ownership discipline (it is written in the C style; the Rust version is presented at Example 1):

    int* take_max(int* ma, int* mb) {
      if (*ma >= *mb) return ma; else return mb;
    }
    bool inc_max(int a, int b) {
      {
        int* mc = take_max(&a, &b); // borrow a and b
        *mc += 1;
      } // end of borrow
      return (a != b);
    }

Figure 1 illustrates which alias has the update permission to the contents of a and b during the execution of inc_max(5,3).

A notable feature is borrow. In the running example, when the pointers &a and &b are taken for take_max, the update permissions of a and b are temporarily transferred to the pointers. The original variables, a and b, lose the ability to access their contents until the end of borrow. The function take_max returns a pointer having the update permission until the end of borrow, which justifies the update operation *mc += 1. In this example, the end of borrow is at the end of the inner block of inc_max. At this point, the permissions are given back to the original variables a and b, allowing us to compute a != b.
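For readers who want to run the example, the following is a plain Rust rendering of the two functions, as a minimal sketch that the Rust compiler's borrow checker accepts. Parameter types (`i32`) and the exact block structure are assumptions chosen to mirror the C-style code.

```rust
/// Returns the reference whose target is larger (ties go to `ma`).
fn take_max<'a>(ma: &'a mut i32, mb: &'a mut i32) -> &'a mut i32 {
    if *ma >= *mb { ma } else { mb }
}

fn inc_max(mut a: i32, mut b: i32) -> bool {
    {
        let mc = take_max(&mut a, &mut b); // borrow a and b
        *mc += 1;
    } // end of borrow: a and b are accessible again
    a != b
}
```

Reading `a` or `b` inside the inner block while `mc` is still in use would be rejected at compile time, which is the static counterpart of the permission transfer described in the text.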
Note that mc can point to a and also to b and that this choice is determined dynamically. The values of a and b after the borrow depend on the behavior of the pointer mc.

[Figure 1: permission timelines of a, ma, mc, mb and b over the phases (i) to (iv), which are delimited by the call of take_max, the return of take_max and the end of borrowing.]

Fig. 1. Values and aliases of a and b in evaluating inc_max(5,3). Each line shows each variable's permission timeline: a solid line expresses the update permission and a bullet shows a point when the borrowed permission is given back. For example, b has the update permission to its content during (i) and (iv), but not during (ii) and (iii) because the pointer mb, created at the call of take_max, borrows b until the end of (iii).

The end of each borrow is statically managed by a lifetime. See §.

Key Idea.
The key idea of our method is to represent a pointer ma as a pair $\langle a, a_\circ \rangle$ of the current target value $a$ and the target value $a_\circ$ at the end of borrow. This representation employs access to the future information (it is related to prophecy variables; see §). The question "Does inc_max always return true?" is reduced to the satisfiability of the following CHCs:

$$
\begin{aligned}
\mathit{TakeMax}(\langle a, a_\circ \rangle, \langle b, b_\circ \rangle, r) &\Longleftarrow a \ge b \land b_\circ = b \land r = \langle a, a_\circ \rangle \\
\mathit{TakeMax}(\langle a, a_\circ \rangle, \langle b, b_\circ \rangle, r) &\Longleftarrow a < b \land a_\circ = a \land r = \langle b, b_\circ \rangle \\
\mathit{IncMax}(a, b, r) &\Longleftarrow \mathit{TakeMax}(\langle a, a_\circ \rangle, \langle b, b_\circ \rangle, \langle c, c_\circ \rangle) \land c' = c + 1 \land c_\circ = c' \land r = (a_\circ \mathbin{!=} b_\circ) \\
r = \mathit{true} &\Longleftarrow \mathit{IncMax}(a, b, r).
\end{aligned}
$$

The mutable reference ma is now represented as $\langle a, a_\circ \rangle$, and similarly for mb and mc. The first CHC models the then-clause of take_max: the return value is ma, which is expressed as $r = \langle a, a_\circ \rangle$; in contrast, mb is released, which constrains $b_\circ$, the value of b at the end of borrow, to the current value $b$. In the clause on $\mathit{IncMax}$, mc is represented as a pair $\langle c, c_\circ \rangle$. The constraint $c' = c + 1 \land c_\circ = c'$ models the increment of mc (in the phase (iii) in Fig. 1). Importantly, the final check a != b is simply expressed as $a_\circ \mathbin{!=} b_\circ$; the updated values of a/b are available as $a_\circ$/$b_\circ$. Clearly, the CHC system above has a simple model.

Footnote: Precisely, this is the representation of a pointer with a borrowed update permission (i.e. mutable reference). Other cases are discussed in §.
Footnote: For example, in the case of Fig. 1, when take_max is called, the pointer ma is $\langle 5, 6 \rangle$ and mb is $\langle 3, 3 \rangle$.
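The CHCs for take_max and inc_max can be explored by brute force on a small domain. The Rust sketch below is an illustration only: `MutRef`, `take_max_rel` and `inc_max_results` are hypothetical names, prophecy values are searched in a bounded range `lo..=hi`, and the search relies on the fact (visible in both $\mathit{TakeMax}$ clauses) that the returned reference is one of the two arguments.

```rust
/// A mutable reference modeled as (current value, value at the end of borrow).
type MutRef = (i64, i64);

/// The two `TakeMax` clauses, read as a predicate on their arguments.
fn take_max_rel(ma: MutRef, mb: MutRef, r: MutRef) -> bool {
    let ((a, a_fin), (b, b_fin)) = (ma, mb);
    (a >= b && b_fin == b && r == (a, a_fin)) || (a < b && a_fin == a && r == (b, b_fin))
}

/// Every `r` derivable for `IncMax(a, b, r)`, trying all prophecy values
/// `a_fin`, `b_fin` in `lo..=hi` (a bounded search, not a proof).
fn inc_max_results(a: i64, b: i64, lo: i64, hi: i64) -> Vec<bool> {
    let mut results = Vec::new();
    for a_fin in lo..=hi {
        for b_fin in lo..=hi {
            // The returned reference is one of the two arguments.
            for (c, c_fin) in [(a, a_fin), (b, b_fin)] {
                let c_new = c + 1; // c' = c + 1
                if take_max_rel((a, a_fin), (b, b_fin), (c, c_fin)) && c_fin == c_new {
                    results.push(a_fin != b_fin); // r = (a° != b°)
                }
            }
        }
    }
    results
}
```

For inputs whose successors fit in the searched range, exactly one prophecy assignment satisfies the body of the $\mathit{IncMax}$ clause, and the derived result is `true`. For `a = 5`, `b = 3` the surviving assignment is `a_fin = 6`, `b_fin = 3`, matching the pairs mentioned for Fig. 1.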
Also, for the just_rec example in §, we obtain the following CHC system:

$$
\begin{aligned}
\mathit{JustRec}(\langle a, a_\circ \rangle, r) &\Longleftarrow a_\circ = a \land r = \mathit{true} \\
\mathit{JustRec}(\langle a, a_\circ \rangle, r) &\Longleftarrow mb = \langle b, b_\circ \rangle \land \mathit{JustRec}(mb, \_) \land a_\circ = a \land r = (a \mathbin{==} a_\circ) \\
r = \mathit{true} &\Longleftarrow \mathit{JustRec}(\langle a, a_\circ \rangle, r).
\end{aligned}
$$

Now it has a very simple model: $\mathit{JustRec}(ma, r) :\iff r = \mathit{true}$. Remarkably, arrays and quantified formulas are not required to express the model, which allows the CHC system to be easily solved by many CHC solvers. More advanced examples are presented in §.

Contributions.
Based on the above idea, we formalize the translation from programs to CHC systems for a core language of Rust, prove correctness (both soundness and completeness) of the translation, and confirm the effectiveness of our approach through preliminary experiments. The core language supports, among others, recursive types. Remarkably, our approach enables us to automatically verify some properties of a program with destructive updates on recursive data types such as lists and trees.

The rest of the paper is structured as follows. In § 2, we provide a formalized core language of Rust supporting recursions, lifetime-based ownership and recursive types. In § 3, we formalize our translation from programs to CHCs and prove its correctness. In § 4, we report on the implementation and the experimental results.

We formalize a core of Rust as Calculus of Ownership and Reference (COR), whose design has been affected by the safe layer of $\lambda_{\mathrm{Rust}}$ in the RustBelt paper [32]. It is a typed procedural language with a Rust-like ownership system.
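The following is the syntax of COR.

$$
\begin{array}{lrcl}
\text{(program)} & \Pi & ::= & F_0 \cdots F_{n-1} \\
\text{(function definition)} & F & ::= & \mathbf{fn}\ f\ \Sigma\ \{ L_0{:}\ S_0\ \cdots\ L_{n-1}{:}\ S_{n-1} \} \\
\text{(function signature)} & \Sigma & ::= & \langle \alpha_0, \dots, \alpha_{m-1} \mid \alpha_{a_0} \le \alpha_{b_0}, \dots, \alpha_{a_{l-1}} \le \alpha_{b_{l-1}} \rangle\ (x_0{:}\ T_0, \dots, x_{n-1}{:}\ T_{n-1}) \to U \\
\text{(statement)} & S & ::= & I;\ \mathbf{goto}\ L \mid \mathbf{return}\ x \mid \mathbf{match}\ {*x}\ \{ \mathbf{inj}_0\ {*y_0} \to \mathbf{goto}\ L_0,\ \mathbf{inj}_1\ {*y_1} \to \mathbf{goto}\ L_1 \} \\
\text{(instruction)} & I & ::= & \mathbf{let}\ y = \mathbf{mutbor}_\alpha\ x \mid \mathbf{drop}\ x \mid \mathbf{immut}\ x \mid \mathbf{swap}({*x}, {*y}) \\
& & \mid & \mathbf{let}\ {*y} = x \mid \mathbf{let}\ y = {*x} \mid \mathbf{let}\ {*y} = \mathbf{copy}\ {*x} \mid x\ \mathbf{as}\ T \\
& & \mid & \mathbf{let}\ y = f\langle \alpha_0, \dots, \alpha_{m-1} \rangle(x_0, \dots, x_{n-1}) \\
& & \mid & \mathbf{intro}\ \alpha \mid \mathbf{now}\ \alpha \mid \alpha \le \beta \\
& & \mid & \mathbf{let}\ {*y} = \mathit{const} \mid \mathbf{let}\ {*y} = {*x}\ \mathit{op}\ {*x'} \mid \mathbf{let}\ {*y} = \mathbf{rand}() \\
& & \mid & \mathbf{let}\ {*y} = \mathbf{inj}_i^{T_0 + T_1}\ {*x} \mid \mathbf{let}\ {*y} = ({*x_0}, {*x_1}) \mid \mathbf{let}\ ({*y_0}, {*y_1}) = {*x} \\
\text{(type)} & T, U & ::= & X \mid \mu X.T \mid P\ T \mid T_0 + T_1 \mid T_0 \times T_1 \mid \mathbf{int} \mid \mathbf{unit} \\
\text{(pointer kind)} & P & ::= & \mathbf{own} \mid R_\alpha \\
\text{(reference kind)} & R & ::= & \mathbf{mut} \mid \mathbf{immut}
\end{array}
$$

$\alpha, \beta, \gamma$ ::= (lifetime variable) $\quad X, Y$ ::= (type variable) $\quad x, y$ ::= (variable) $\quad f, g$ ::= (function name) $\quad L$ ::= (label)

$\mathit{const} ::= n \mid ()$ $\quad \mathbf{bool} := \mathbf{unit} + \mathbf{unit}$ $\quad \mathit{op} ::= \mathit{op}_{\mathbf{int}} \mid \mathit{op}_{\mathbf{bool}}$ $\quad \mathit{op}_{\mathbf{int}} ::= + \mid - \mid \cdots$ $\quad \mathit{op}_{\mathbf{bool}} ::= \mathtt{>=} \mid \mathtt{==} \mid \mathtt{!=} \mid \cdots$

Program, Function and Label.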
A program (denoted by $\Pi$) is a set of function definitions. A function definition ($F$) consists of a function name, a function signature and a set of labeled statements ($L{:}\ S$). In COR, for simplicity, the input/output types of a function are restricted to pointer types. A function is parametrized over lifetime parameters under constraints; polymorphism on types is not supported for simplicity, just as $\lambda_{\mathrm{Rust}}$. For the lifetime parameter receiver, often $\langle \alpha_0, \cdots \mid \rangle$ is abbreviated to $\langle \alpha_0, \dots \rangle$ and $\langle \mid \rangle$ is omitted.

A label ($L$) is an abstract program point to be jumped to by goto. Each label is assigned a whole context by the type system, as we see later. This style, with unstructured control flows, helps the formal description of CHCs in §. A function should have the label entry (entry point), and every label in a function should be syntactically reachable from entry by goto jumps.

Statement and Instruction.
A statement ($S$) performs an instruction with a jump ($I;\ \mathbf{goto}\ L$), returns from a function ($\mathbf{return}\ x$), or branches ($\mathbf{match}\ {*x}\ \{\cdots\}$).

An instruction ($I$) performs an elementary operation: mutable (re)borrow ($\mathbf{let}\ y = \mathbf{mutbor}_\alpha\ x$), releasing a variable ($\mathbf{drop}\ x$), weakening ownership ($\mathbf{immut}\ x$), swap ($\mathbf{swap}({*x}, {*y})$), creating/dereferencing a pointer ($\mathbf{let}\ {*y} = x$, $\mathbf{let}\ y = {*x}$), copy ($\mathbf{let}\ {*y} = \mathbf{copy}\ {*x}$), type weakening ($x\ \mathbf{as}\ T$), function call ($\mathbf{let}\ y = f\langle\cdots\rangle(\cdots)$), lifetime-related ghost operations ($\mathbf{intro}\ \alpha$, $\mathbf{now}\ \alpha$, $\alpha \le \beta$; explained later), getting a constant / operation result / random integer ($\mathbf{let}\ {*y} = \mathit{const}$ / ${*x}\ \mathit{op}\ {*x'}$ / $\mathbf{rand}()$), creating a variant ($\mathbf{let}\ {*y} = \mathbf{inj}_i^{T_0 + T_1}\ {*x}$), and creating/destructing a pair ($\mathbf{let}\ {*y} = ({*x_0}, {*x_1})$, $\mathbf{let}\ ({*y_0}, {*y_1}) = {*x}$). An instruction of form $\mathbf{let}\ {*y} = \cdots$ implicitly allocates new memory cells as $y$; also, some instructions deallocate memory cells implicitly. For simplicity, every variable is designed to be a pointer and every release of a variable should be explicitly annotated by '$\mathbf{drop}\ x$'. In addition, we provide swap instead of assignment; the usual assignment (of copyable data from ${*x}$ to ${*y}$) can be expressed by $\mathbf{let}\ {*x'} = \mathbf{copy}\ {*x};\ \mathbf{swap}({*y}, {*x'});\ \mathbf{drop}\ x'$.

Footnote: It is related to a continuation introduced by letcont in $\lambda_{\mathrm{Rust}}$.
Footnote: Here 'syntactically' means that detailed information such as a branch condition on match or non-termination is ignored.
Footnote: This instruction turns a mutable reference to an immutable reference. Using this, an immutable borrow from $x$ to $y$ can be expressed by $\mathbf{let}\ y = \mathbf{mutbor}_\alpha\ x;\ \mathbf{immut}\ y$.
Footnote: Copying a pointer (an immutable reference) $x$ to $y$ can be expressed by $\mathbf{let}\ {*ox} = x;\ \mathbf{let}\ {*oy} = \mathbf{copy}\ {*ox};\ \mathbf{let}\ y = {*oy}$.

Type.
As a type ($T$), we support recursive types ($\mu X.T$), pointer types ($P\ T$), variant types ($T_0 + T_1$), pair types ($T_0 \times T_1$) and basic types ($\mathbf{int}$, $\mathbf{unit}$). A pointer type $P\ T$ can be an owning pointer $\mathbf{own}\ T$ (Box
COR can express most borrow patterns in the core of Rust. The set of moments when a borrow is active forms a continuous time range, even under non-lexical lifetimes [54].

A major limitation of COR is that it does not support unsafe code blocks and also lacks type traits and closures. Still, our idea can be combined with unsafe code and closures, as discussed in §. Unlike Rust and $\lambda_{\mathrm{Rust}}$, we cannot directly modify/borrow a fragment of a variable (e.g. an element of a pair). Still, we can eventually modify/borrow a fragment by borrowing the whole variable and splitting pointers (e.g. '$\mathbf{let}\ ({*y_0}, {*y_1}) = {*x}$'). This borrow-and-split strategy, nevertheless, yields a subtle obstacle when we extend the calculus for advanced data types (e.g. get_default in 'Problem Case

Footnote: In Rust, even after a reference loses the permission and the lifetime ends, its address data can linger in the memory, although dereferencing on the reference is no longer allowed. We simplify the behavior of lifetimes in COR.
Footnote: In the terminology of Rust, a lifetime often means a time range where a borrow is active. To simplify the discussions, however, we in this paper use the term lifetime to refer to a time point when a borrow ends.
Footnote: Strictly speaking, this property is broken by recently adopted implicit two-phase borrows [59,53]. However, by shallow syntactical reordering, a program with implicit two-phase borrows can be fit into usual borrow patterns.

Example 1 (COR Program).
The following program expresses the functions take_max and inc_max presented in §. Sequential execution is abbreviated by writing the next label after the semicolon (e.g. $L_0{:}\ I_0;\ {}^{L_1} I_1;\ \mathbf{goto}\ L_2$ stands for $L_0{:}\ I_0;\ \mathbf{goto}\ L_1 \quad L_1{:}\ I_1;\ \mathbf{goto}\ L_2$).

    fn take-max ⟨α⟩ (ma: mut_α int, mb: mut_α int) → mut_α int {
      entry: let ∗ord = ∗ma >= ∗mb; L1 match ∗ord { inj_1 ∗ou → goto L2, inj_0 ∗ou → goto L5 }
      L2: drop ou; L3 drop mb; L4 return ma
      L5: drop ou; L6 drop ma; L7 return mb
    }
    fn inc-max (oa: own int, ob: own int) → own bool {
      entry: intro α; L1 let ma = mutbor_α oa; L2 let mb = mutbor_α ob;
      L3 let mc = take-max⟨α⟩(ma, mb); L4 let ∗o1 = 1; L5 let ∗oc′ = ∗mc + ∗o1; L6 drop o1;
      L7 swap(mc, oc′); L8 drop oc′; L9 drop mc; L10 now α;
      L11 let ∗or = ∗oa != ∗ob; L12 drop oa; L13 drop ob; L14 return or
    }

In take-max, conditional branching is performed by match and its goto directions (at L1). In inc-max, increment on the mutable reference mc is performed by calculating the new value (at L4, L5) and updating the data by swap (at L7).

The following is the corresponding Rust program, with ghost annotations (marked italic and dark green in the original, e.g. drop ma) on lifetimes and releases of mutable references.

    fn take_max<'a>(ma: &'a mut i32, mb: &'a mut i32) -> &'a mut i32 {
      if *ma >= *mb { drop mb; ma } else { drop ma; mb }
    }
    fn inc_max(mut a: i32, mut b: i32) -> bool {
      { intro 'a;
        let mc = take_max<'a>(&'a mut a, &'a mut b); *mc += 1;
        drop mc; now 'a; }
      a != b
    }

Footnote: The first character of each variable indicates the pointer kind (o/m corresponds to $\mathbf{own}$/$\mathbf{mut}_\alpha$). We swap the branches of the match statement in take-max, to fit the order to C/Rust's if.

The type system of COR assigns to each label a whole context $(\Gamma, \mathbf{A})$. We define below the whole context and the typing judgments.

Context.
A variable context $\Gamma$ is a finite set of items of form $x :^{a} T$, where $T$ should be a complete pointer type and $a$ (which we call activeness) is of form 'active' or '$\dagger\alpha$' (frozen until lifetime $\alpha$). We abbreviate $x :^{\text{active}} T$ as $x : T$. A variable context should not contain two items on the same variable. A lifetime context $\mathbf{A} = (A, R)$ is a finite preordered set of lifetime variables, where $A$ is the underlying set and $R$ is the preorder. We write $|\mathbf{A}|$ and $\le_{\mathbf{A}}$ to refer to $A$ and $R$. Finally, a whole context $(\Gamma, \mathbf{A})$ is a pair of a variable context $\Gamma$ and a lifetime context $\mathbf{A}$ such that every lifetime variable in $\Gamma$ is contained in $\mathbf{A}$.

Notations.
The set operation $A + B$ (or more generally $\sum_\lambda A_\lambda$) denotes the disjoint union, i.e. the union defined only if the arguments are disjoint. The set operation $A - B$ denotes the set difference defined only if $A \supseteq B$. For a natural number $n$, $[n]$ denotes the set $\{0, \dots, n-1\}$.

Generally, an auxiliary definition for a rule can be presented just below, possibly in a dotted box.

Program and Function.
The rules for typing programs and functions are presented below. They assign to each label a whole context $(\Gamma, \mathbf{A})$. '$S :_{\Pi,f} (\Gamma, \mathbf{A}) \mid (\Gamma_L, \mathbf{A}_L)_L \mid U$' is explained later.

$$
\frac{\text{for any } F \text{ in } \Pi,\ \ F :_{\Pi} (\Gamma_{\mathrm{name}(F),L}, \mathbf{A}_{\mathrm{name}(F),L})_{L \in \mathrm{Label}_F}}
     {\Pi : (\Gamma_{f,L}, \mathbf{A}_{f,L})_{(f,L) \in \mathrm{FnLabel}_\Pi}}
$$

$\mathrm{name}(F)$: the function name of $F$. $\mathrm{Label}_F$: the set of labels in $F$. $\mathrm{FnLabel}_\Pi$: the set of pairs $(f, L)$ such that a function $f$ in $\Pi$ has a label $L$.

$$
\frac{\begin{array}{c}
F = \mathbf{fn}\ f\ \langle \alpha_0, \dots, \alpha_{m-1} \mid \alpha_{a_0} \le \alpha_{b_0}, \dots, \alpha_{a_{l-1}} \le \alpha_{b_{l-1}} \rangle\ (x_0{:}\ T_0, \dots, x_{n-1}{:}\ T_{n-1}) \to U\ \{\cdots\} \\
\Gamma_{\mathrm{entry}} = \{ x_i : T_i \mid i \in [n] \} \qquad A = \{ \alpha_j \mid j \in [m] \} \qquad \mathbf{A}_{\mathrm{entry}} = \bigl(A, \bigl(\mathrm{Id}_A \cup \{ (\alpha_{a_k}, \alpha_{b_k}) \mid k \in [l] \}\bigr)^{+}\bigr) \\
\text{for any } L'{:}\ S \in \mathrm{LabelStmt}_F,\ \ S :_{\Pi,f} (\Gamma_{L'}, \mathbf{A}_{L'}) \mid (\Gamma_L, \mathbf{A}_L)_{L \in \mathrm{Label}_F} \mid U
\end{array}}
{F :_{\Pi} (\Gamma_L, \mathbf{A}_L)_{L \in \mathrm{Label}_F}}
$$

$\mathrm{LabelStmt}_F$: the set of labeled statements in $F$. $\mathrm{Id}_A$: the identity relation on $A$. $R^{+}$: the transitive closure of $R$.

On the rule for the function, the initial whole context at entry is specified (the second and third preconditions) and also the contexts for other labels are checked (the fourth precondition). The context for each label (in each function) can actually be determined in the order by the distance in the number of goto jumps from entry, but that order is not very obvious because of unstructured control flows.

Statement. '$S :_{\Pi,f} (\Gamma, \mathbf{A}) \mid (\Gamma_L, \mathbf{A}_L)_L \mid U$' means that running the statement $S$ (under $\Pi, f$) with the whole context $(\Gamma, \mathbf{A})$ results in a jump to a label with the whole contexts specified by $(\Gamma_L, \mathbf{A}_L)_L$ or a return of data of type $U$. Its rules are presented below. '$I :_{\Pi,f} (\Gamma, \mathbf{A}) \to (\Gamma', \mathbf{A}')$' is explained later.

$$
\frac{I :_{\Pi,f} (\Gamma, \mathbf{A}) \to (\Gamma_{L_0}, \mathbf{A}_{L_0})}
     {I;\ \mathbf{goto}\ L_0 :_{\Pi,f} (\Gamma, \mathbf{A}) \mid (\Gamma_L, \mathbf{A}_L)_L \mid U}
\qquad
\frac{\Gamma = \{ x : U \} \qquad |\mathbf{A}| = A_{\mathrm{ex}\,\Pi,f}}
     {\mathbf{return}\ x :_{\Pi,f} (\Gamma, \mathbf{A}) \mid (\Gamma_L, \mathbf{A}_L)_L \mid U}
$$

$A_{\mathrm{ex}\,\Pi,f}$: the set of lifetime parameters of $f$ in $\Pi$.

$$
\frac{x : P\,(T_0 + T_1) \in \Gamma \qquad \text{for } i = 0, 1,\ \ (\Gamma_{L_i}, \mathbf{A}_{L_i}) = (\Gamma - \{ x : P\,(T_0 + T_1) \} + \{ y_i : P\ T_i \},\ \mathbf{A})}
     {\mathbf{match}\ {*x}\ \{ \mathbf{inj}_0\ {*y_0} \to \mathbf{goto}\ L_0,\ \mathbf{inj}_1\ {*y_1} \to \mathbf{goto}\ L_1 \} :_{\Pi,f} (\Gamma, \mathbf{A}) \mid (\Gamma_L, \mathbf{A}_L)_L \mid U}
$$

The rule for the return statement ensures that there remain no extra variables and local lifetime variables.
Instruction. '$I :_{\Pi,f} (\Gamma, \mathbf{A}) \to (\Gamma', \mathbf{A}')$' means that running the instruction $I$ (under $\Pi, f$) updates the whole context $(\Gamma, \mathbf{A})$ into $(\Gamma', \mathbf{A}')$. The rules are designed so that, for any $I$, $\Pi$, $f$, $(\Gamma, \mathbf{A})$, there exists at most one $(\Gamma', \mathbf{A}')$ such that $I :_{\Pi,f} (\Gamma, \mathbf{A}) \to (\Gamma', \mathbf{A}')$ holds. Below we present some of the rules; the complete rules are presented in Appendix A.1. The following is the typing rule for mutable (re)borrow.

$$
\frac{\alpha \notin A_{\mathrm{ex}\,\Pi,f} \qquad P = \mathbf{own}, \mathbf{mut}_\beta \qquad \text{for any } \gamma \in \mathrm{Lifetime}_{P\,T},\ \alpha \le_{\mathbf{A}} \gamma}
     {\mathbf{let}\ y = \mathbf{mutbor}_\alpha\ x :_{\Pi,f} (\Gamma + \{ x : P\ T \}, \mathbf{A}) \to (\Gamma + \{ y : \mathbf{mut}_\alpha\ T,\ x :^{\dagger\alpha} P\ T \}, \mathbf{A})}
$$

$\mathrm{Lifetime}_T$: the set of lifetime variables occurring in $T$.

After you mutably (re)borrow an owning pointer / mutable reference $x$ until $\alpha$, $x$ is frozen until $\alpha$. Here, $\alpha$ should be a local lifetime variable (the first precondition) that does not live longer than the data of $x$ (the third precondition). Below are the typing rules for local lifetime variable introduction and elimination.

Footnote: In COR, a reference that lives after the return from the function should be created by splitting a reference (e.g. '$\mathbf{let}\ ({*y_0}, {*y_1}) = {*x}$') given in the inputs; see also Expressivity and Limitations.

$$
\frac{}{\mathbf{intro}\ \alpha :_{\Pi,f} \bigl(\Gamma, (A, R)\bigr) \to \bigl(\Gamma, (\{\alpha\} + A,\ \{\alpha\} \times (\{\alpha\} + A_{\mathrm{ex}\,\Pi,f}) + R)\bigr)}
$$

$$
\frac{\alpha \notin A_{\mathrm{ex}\,\Pi,f}}
     {\mathbf{now}\ \alpha :_{\Pi,f} \bigl(\Gamma, (\{\alpha\} + A, R)\bigr) \to \bigl(\{ \mathrm{thaw}_\alpha(x :^{a} T) \mid x :^{a} T \in \Gamma \},\ (A, \{ (\beta, \gamma) \in R \mid \beta \ne \alpha \})\bigr)}
$$

$$
\mathrm{thaw}_\alpha(x :^{a} T) := \begin{cases} x : T & (a = \dagger\alpha) \\ x :^{a} T & (\text{otherwise}) \end{cases}
$$

On $\mathbf{intro}\ \alpha$, it just ensures the new local lifetime variable to be earlier than any lifetime parameters (which are given by exterior functions). On $\mathbf{now}\ \alpha$, the variables frozen with $\alpha$ get active again. Below is the typing rule for dereference of a pointer to a pointer, which may be a bit interesting.

$$
\frac{}{\mathbf{let}\ y = {*x} :_{\Pi,f} (\Gamma + \{ x : P\ P'\ T \}, \mathbf{A}) \to (\Gamma + \{ y : (P \circ P')\ T \}, \mathbf{A})}
$$

$$
P \circ \mathbf{own} = \mathbf{own} \circ P := P \qquad R_\alpha \circ R'_\beta := R''_\alpha \ \text{ where } R'' = \begin{cases} \mathbf{mut} & (R = R' = \mathbf{mut}) \\ \mathbf{immut} & (\text{otherwise}) \end{cases}
$$

The third precondition of the typing rule for mutbor justifies taking just $\alpha$ in the rule '$R_\alpha \circ R'_\beta := R''_\alpha$'.

Let us interpret $\Pi : (\Gamma_{f,L}, \mathbf{A}_{f,L})_{(f,L) \in \mathrm{FnLabel}_\Pi}$ as "the program $\Pi$ has the type $(\Gamma_{f,L}, \mathbf{A}_{f,L})_{(f,L) \in \mathrm{FnLabel}_\Pi}$". The type system ensures that any program has at most one type (which may be a bit unclear because of unstructured control flows). Hereinafter, we implicitly assume that a program has a type.

We introduce for COR concrete operational semantics, which handles a concrete model of the heap memory.

The basic item, concrete configuration $\mathbf{C}$, is defined as follows.

$$
\mathbf{S} ::= \mathrm{end} \mid [f, L]\ x, \mathbf{F};\ \mathbf{S} \qquad \text{(concrete configuration)}\ \ \mathbf{C} ::= [f, L]\ \mathbf{F};\ \mathbf{S} \mid \mathbf{H}
$$

Here, $\mathbf{H}$ is a heap, which maps addresses (represented by integers) to integers (data). $\mathbf{F}$ is a concrete stack frame, which maps variables to addresses. The stack part of $\mathbf{C}$ is of form '$[f, L]\ \mathbf{F};\ [f', L']\ x, \mathbf{F}';\ \cdots;\ \mathrm{end}$' (we may omit the terminator '; end'). $[f, L]$ on each stack frame indicates the program point. '$x,$' on each non-top stack frame is the receiver of the value returned by the function call.

Concrete operational semantics is characterized by the one-step transition relation $\mathbf{C} \to_\Pi \mathbf{C}'$ and the termination relation $\mathrm{final}_\Pi(\mathbf{C})$, which can be defined straightforwardly. Below we show the rules for mutable (re)borrow, swap, function call and return from a function; the complete rules and an example execution are presented in Appendix A.2. $S_{\Pi,f,L}$ is the statement for the label $L$ of the function $f$ in $\Pi$. $\mathrm{Ty}_{\Pi,f,L}(x)$ is the type of variable $x$ at the label.

$$
\frac{S_{\Pi,f,L} = \mathbf{let}\ y = \mathbf{mutbor}_\alpha\ x;\ \mathbf{goto}\ L' \qquad \mathbf{F}(x) = a}
     {[f, L]\ \mathbf{F};\ \mathbf{S} \mid \mathbf{H} \ \to_\Pi\ [f, L']\ \mathbf{F} + \{ (y, a) \};\ \mathbf{S} \mid \mathbf{H}}
$$

$$
\frac{S_{\Pi,f,L} = \mathbf{swap}({*x}, {*y});\ \mathbf{goto}\ L' \qquad \mathrm{Ty}_{\Pi,f,L}(x) = P\ T \qquad \mathbf{F}(x) = a \qquad \mathbf{F}(y) = b}
     {\begin{array}{l}
      [f, L]\ \mathbf{F};\ \mathbf{S} \mid \mathbf{H} + \{ (a + k, m_k) \mid k \in [\#T] \} + \{ (b + k, n_k) \mid k \in [\#T] \} \\
      \quad \to_\Pi\ [f, L']\ \mathbf{F};\ \mathbf{S} \mid \mathbf{H} + \{ (a + k, n_k) \mid k \in [\#T] \} + \{ (b + k, m_k) \mid k \in [\#T] \}
      \end{array}}
$$

$$
\frac{S_{\Pi,f,L} = \mathbf{let}\ y = g\langle\cdots\rangle(x_0, \dots, x_{n-1});\ \mathbf{goto}\ L' \qquad \Sigma_{\Pi,g} = \langle\cdots\rangle\ (x'_0{:}\ T_0, \dots, x'_{n-1}{:}\ T_{n-1}) \to U}
     {[f, L]\ \mathbf{F} + \{ (x_i, a_i) \mid i \in [n] \};\ \mathbf{S} \mid \mathbf{H} \ \to_\Pi\ [g, \mathrm{entry}]\ \{ (x'_i, a_i) \mid i \in [n] \};\ [f, L]\ y, \mathbf{F};\ \mathbf{S} \mid \mathbf{H}}
$$

$$
\frac{S_{\Pi,f,L} = \mathbf{return}\ x}
     {[f, L]\ \{ (x, a) \};\ [g, L']\ x', \mathbf{F}';\ \mathbf{S} \mid \mathbf{H} \ \to_\Pi\ [g, L']\ \mathbf{F}' + \{ (x', a) \};\ \mathbf{S} \mid \mathbf{H}}
\qquad
\frac{S_{\Pi,f,L} = \mathbf{return}\ x}
     {\mathrm{final}_\Pi\bigl([f, L]\ \{ (x, a) \} \mid \mathbf{H}\bigr)}
$$

Here we introduce '$\#T$', which represents how many memory cells the type $T$ takes (at the outermost level). $\#T$ is defined for every complete type $T$, because every occurrence of type variables in a complete type is guarded by a pointer constructor.

$$
\#(T_0 + T_1) := 1 + \max\{ \#T_0, \#T_1 \} \qquad \#(T_0 \times T_1) := \#T_0 + \#T_1 \qquad \#\mu X.T := \#T[\mu X.T/X] \qquad \#\mathbf{int} = \#P\ T := 1 \qquad \#\mathbf{unit} = 0
$$
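The size function $\#T$ can be sketched as a short recursive function. The Rust code below is an illustration under assumptions that are not in the paper: the `Ty` enum is a hypothetical representation of COR types, pointer kinds and lifetimes are collapsed into a single `Ptr` constructor because they do not influence the size, and type variables are left unnamed.

```rust
/// COR types (a sketch; pointer kinds and lifetimes are omitted because they
/// do not affect the size, and type variables are left unnamed).
enum Ty {
    Var,                    // X (only meaningful under a pointer)
    Mu(Box<Ty>),            // μX.T
    Ptr(Box<Ty>),           // P T, for any pointer kind P
    Sum(Box<Ty>, Box<Ty>),  // T0 + T1
    Pair(Box<Ty>, Box<Ty>), // T0 × T1
    Int,
    Unit,
}

/// Number of memory cells a complete type occupies at the outermost level.
/// Returns `None` on an unguarded type variable, which cannot occur in a
/// complete type.
fn size(t: &Ty) -> Option<usize> {
    match t {
        Ty::Var => None,
        // #μX.T = #T[μX.T/X]; since X is guarded by a pointer of size 1,
        // the substitution never changes the result.
        Ty::Mu(body) => size(body),
        Ty::Ptr(_) | Ty::Int => Some(1),
        Ty::Sum(t0, t1) => Some(1 + size(t0)?.max(size(t1)?)),
        Ty::Pair(t0, t1) => Some(size(t0)? + size(t1)?),
        Ty::Unit => Some(0),
    }
}
```

For instance, $\mathbf{bool} = \mathbf{unit} + \mathbf{unit}$ takes one cell (the tag), and a list type of the shape $\mu X.\ \mathbf{unit} + \mathbf{int} \times \mathbf{own}\ X$ takes three cells: one tag plus the larger payload, which is an integer and a pointer.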
To formalize the idea discussed in § 1, we give a translation from COR programs to CHC systems, which precisely characterize the input-output relations of the COR programs. We first define the logic for CHCs (§) and then describe our translation (§).

To begin with, we introduce a first-order multi-sorted logic for describing the CHC representation of COR programs.
Syntax.
The syntax is defined as follows. (CHC) Φ ::= ∀ x : σ , . . . , x m − : σ m − . ˇ ϕ ⇐ = ψ ∧ · · · ∧ ψ n − (cid:62) := the nullary conjunction of formulas(formula) ϕ, ψ ::= f ( t , . . . , t n − ) (elementary formula) ˇ ϕ ::= f ( p , . . . , p n − )(term) t ::= x | (cid:104) t (cid:105) | (cid:104) t ∗ , t ◦ (cid:105) | inj i t | ( t , t ) | ∗ t | ◦ t | t.i | const | t op t (cid:48) (value) v, w ::= (cid:104) v (cid:105) | (cid:104) v ∗ , v ◦ (cid:105) | inj i v | ( v , v ) | const (pattern) p, q ::= x | (cid:104) p (cid:105) | (cid:104) p ∗ , p ◦ (cid:105) | inj i p | ( p , p ) | const (sort) σ, τ ::= X | µX.σ | C σ | σ + σ | σ × σ | int | unit (container kind) C ::= box | mut const ::= same as COR op ::= same as COR bool := unit + unit true := inj () false := inj () X ::= (sort variable) x, y ::= (variable) f ::= (predicate variable) We introduce box σ and mut σ , which correspond to own T / immut α T and mut α T respectively. (cid:104) t (cid:105) / (cid:104) t ∗ , t ◦ (cid:105) is the constructor for box σ / mut σ . ∗ t takes thebody/first value of (cid:104)−(cid:105) / (cid:104)− , −(cid:105) and ◦ t takes the second value of (cid:104)− , −(cid:105) . We restrictthe form of CHCs here to simplify the proofs later. Although the logic does nothave a primitive for equality, we can define the equality in a CHC system (e.g.by adding ∀ x : σ. Eq ( x, x ) ⇐ = (cid:62) ).A CHC system ( Φ , Ξ ) is a pair of a finite set of CHCs Φ = { Φ , . . . , Φ n − } and Ξ , where Ξ is a finite map from predicate variables to tuples of sorts (denotedby Ξ ), specifying the sorts of the input values. Unlike the informal descriptionin §
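1, we add $\boldsymbol{\Xi}$ to a CHC system.

Sort System. '$t :_\Delta \sigma$' (the term $t$ has the sort $\sigma$ under $\Delta$) is defined as follows. Here, $\Delta$ is a finite map from variables to sorts. $\sigma \sim \tau$ is the congruence on sorts induced by $\mu X.\sigma \sim \sigma[\mu X.\sigma / X]$.

\[
\dfrac{\Delta(x) = \sigma}{x :_\Delta \sigma}
\qquad
\dfrac{t :_\Delta \sigma}{\langle t \rangle :_\Delta \mathrm{box}\ \sigma}
\qquad
\dfrac{t_*, t_\circ :_\Delta \sigma}{\langle t_*, t_\circ \rangle :_\Delta \mathrm{mut}\ \sigma}
\qquad
\dfrac{t :_\Delta \sigma_i}{\mathrm{inj}_i\, t :_\Delta \sigma_0 + \sigma_1}
\qquad
\dfrac{t_0 :_\Delta \sigma_0 \quad t_1 :_\Delta \sigma_1}{(t_0, t_1) :_\Delta \sigma_0 \times \sigma_1}
\]
\[
\dfrac{t :_\Delta C\,\sigma}{{*}t :_\Delta \sigma}
\qquad
\dfrac{t :_\Delta \mathrm{mut}\ \sigma}{{\circ}t :_\Delta \sigma}
\qquad
\dfrac{t :_\Delta \sigma_0 \times \sigma_1}{t.i :_\Delta \sigma_i}
\qquad
\dfrac{}{\mathit{const} :_\Delta \sigma_{\mathit{const}}}
\qquad
\dfrac{t, t' :_\Delta \mathrm{int}}{t \mathbin{\mathit{op}} t' :_\Delta \sigma_{\mathit{op}}}
\qquad
\dfrac{t :_\Delta \sigma \quad \sigma \sim \tau}{t :_\Delta \tau}
\]

Here $\sigma_{\mathit{const}}$ is the sort of $\mathit{const}$ and $\sigma_{\mathit{op}}$ is the output sort of $\mathit{op}$. The projection $t.i$ applies to terms of a product sort, in agreement with its interpretation on pairs $(v_0, v_1)$ given in the semantics.

'$\mathrm{wellSorted}_{\Delta, \boldsymbol{\Xi}}(\varphi)$' and '$\mathrm{wellSorted}_{\boldsymbol{\Xi}}(\Phi)$', the judgments on well-sortedness of formulas and CHCs, are defined as follows.

\[
\dfrac{\boldsymbol{\Xi}(f) = (\sigma_0, \dots, \sigma_{n-1}) \qquad \text{for any } i \in [n],\ t_i :_\Delta \sigma_i}
      {\mathrm{wellSorted}_{\Delta, \boldsymbol{\Xi}}\bigl(f(t_0, \dots, t_{n-1})\bigr)}
\]
\[
\dfrac{\Delta = \{(x_i, \sigma_i) \mid i \in [m]\} \qquad \mathrm{wellSorted}_{\Delta, \boldsymbol{\Xi}}(\check\varphi) \qquad \text{for any } j \in [n],\ \mathrm{wellSorted}_{\Delta, \boldsymbol{\Xi}}(\psi_j)}
      {\mathrm{wellSorted}_{\boldsymbol{\Xi}}\bigl(\forall x_0{:}\,\sigma_0, \dots, x_{m-1}{:}\,\sigma_{m-1}.\ \check\varphi \Longleftarrow \psi_0 \wedge \cdots \wedge \psi_{n-1}\bigr)}
\]

The CHC system $(\boldsymbol{\Phi}, \boldsymbol{\Xi})$ is said to be well-sorted if $\mathrm{wellSorted}_{\boldsymbol{\Xi}}(\Phi)$ holds for any $\Phi \in \boldsymbol{\Phi}$.

Semantics. '$[\![t]\!]_I$', the interpretation of the term $t$ as a value under $I$, is defined as follows. Here, $I$ is a finite map from variables to values. Although the definition is partial, the interpretation is defined for all well-sorted terms.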
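\[
\begin{array}{rcl}
[\![x]\!]_I & := & I(x) \\
[\![\langle t \rangle]\!]_I & := & \langle [\![t]\!]_I \rangle \\
[\![\langle t_*, t_\circ \rangle]\!]_I & := & \langle [\![t_*]\!]_I, [\![t_\circ]\!]_I \rangle \\
[\![\mathrm{inj}_i\, t]\!]_I & := & \mathrm{inj}_i\, [\![t]\!]_I \\
[\![(t_0, t_1)]\!]_I & := & ([\![t_0]\!]_I, [\![t_1]\!]_I) \\
[\![{*}t]\!]_I & := & \begin{cases} v & ([\![t]\!]_I = \langle v \rangle) \\ v_* & ([\![t]\!]_I = \langle v_*, v_\circ \rangle) \end{cases} \\
[\![{\circ}t]\!]_I & := & v_\circ \quad \text{if } [\![t]\!]_I = \langle v_*, v_\circ \rangle \\
[\![t.i]\!]_I & := & v_i \quad \text{if } [\![t]\!]_I = (v_0, v_1) \\
[\![\mathit{const}]\!]_I & := & \mathit{const} \\
[\![t \mathbin{\mathit{op}} t']\!]_I & := & [\![t]\!]_I \mathbin{[\![\mathit{op}]\!]} [\![t']\!]_I
\end{array}
\]

Here $[\![\mathit{op}]\!]$ is the binary operation on values corresponding to $\mathit{op}$.

A predicate structure $\mathcal{M}$ is a finite map from predicate variables to (concrete) predicates on values. $\mathcal{M}, I \models f(t_0, \dots, t_{n-1})$ means that $\mathcal{M}(f)([\![t_0]\!]_I, \dots, [\![t_{n-1}]\!]_I)$ holds. $\mathcal{M} \models \Phi$ is defined as follows.

\[
\dfrac{\text{for any } I \text{ s.t. } \forall i \in [m].\ I(x_i) :_\emptyset \sigma_i,\quad \mathcal{M}, I \models \psi_0, \dots, \psi_{n-1} \text{ implies } \mathcal{M}, I \models \check\varphi}
      {\mathcal{M} \models \forall x_0{:}\,\sigma_0, \dots, x_{m-1}{:}\,\sigma_{m-1}.\ \check\varphi \Longleftarrow \psi_0 \wedge \cdots \wedge \psi_{n-1}}
\]

Finally, $\mathcal{M} \models (\boldsymbol{\Phi}, \boldsymbol{\Xi})$ is defined as follows.

\[
\dfrac{\begin{array}{c}
\text{for any } (f, (\sigma_0, \dots, \sigma_{n-1})) \in \boldsymbol{\Xi},\ \mathcal{M}(f) \text{ is a predicate on values of sort } \sigma_0, \dots, \sigma_{n-1} \\
\mathrm{dom}\, \mathcal{M} = \mathrm{dom}\, \boldsymbol{\Xi} \qquad \text{for any } \Phi \in \boldsymbol{\Phi},\ \mathcal{M} \models \Phi
\end{array}}
      {\mathcal{M} \models (\boldsymbol{\Phi}, \boldsymbol{\Xi})}
\]

When $\mathcal{M} \models (\boldsymbol{\Phi}, \boldsymbol{\Xi})$ holds, we say that $\mathcal{M}$ is a model of $(\boldsymbol{\Phi}, \boldsymbol{\Xi})$. Every well-sorted CHC system $(\boldsymbol{\Phi}, \boldsymbol{\Xi})$ has the least model on the point-wise ordering (which can be proved based on the discussions in [16]), which we write as $\mathcal{M}^{\mathrm{least}}_{(\boldsymbol{\Phi}, \boldsymbol{\Xi})}$.

Now we formalize our translation of Rust programs into CHCs. We define $(|\Pi|)$, which is a CHC system that represents the input-output relations of the functions in the COR program $\Pi$.

Roughly speaking, the least model $\mathcal{M}^{\mathrm{least}}_{(|\Pi|)}$ for this CHC system should satisfy: for any values $v_0, \dots, v_{n-1}, w$, $\mathcal{M}^{\mathrm{least}}_{(|\Pi|)} \models f_{\mathrm{entry}}(v_0, \dots, v_{n-1}, w)$ holds exactly if, in COR, a function call $f(v_0, \dots, v_{n-1})$ can return $w$.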
Actually, in concrete operational semantics, such values should be read out from the heap memory. The formal description and proof of this expected property are presented later, together with the correctness theorem.

Auxiliary Definitions.
The sort corresponding to the type $T$, $(|T|)$, is defined as follows. $\check{P}$ is a meta-variable for a non-mutable-reference pointer kind, i.e. $\mathrm{own}$ or $\mathrm{immut}_\alpha$. Note that the information on lifetimes is all stripped off.

\[
\begin{array}{rclrcl}
(|X|) & := & X & (|\mu X.T|) & := & \mu X.(|T|) \\
(|\check{P}\, T|) & := & \mathrm{box}\ (|T|) & (|\mathrm{mut}_\alpha\, T|) & := & \mathrm{mut}\ (|T|) \\
(|\mathrm{int}|) & := & \mathrm{int} & (|\mathrm{unit}|) & := & \mathrm{unit} \\
(|T_0 + T_1|) & := & (|T_0|) + (|T_1|) & (|T_0 \times T_1|) & := & (|T_0|) \times (|T_1|)
\end{array}
\]

We introduce a special variable $\mathit{res}$ to represent the result of a function. For a label $L$ in a function $f$ in a program $\Pi$, we define $\check\varphi_{\Pi,f,L}$, $\Xi_{\Pi,f,L}$ and $\Delta_{\Pi,f,L}$ as follows, if the items in the variable context for the label are enumerated as $x_0 :^{a_0} T_0, \dots, x_{n-1} :^{a_{n-1}} T_{n-1}$ and the return type of the function is $U$.

\[
\begin{array}{rcl}
\check\varphi_{\Pi,f,L} & := & f_L(x_0, \dots, x_{n-1}, \mathit{res}) \\
\Xi_{\Pi,f,L} & := & \bigl((|T_0|), \dots, (|T_{n-1}|), (|U|)\bigr) \\
\Delta_{\Pi,f,L} & := & \{(x_i, (|T_i|)) \mid i \in [n]\} + \{(\mathit{res}, (|U|))\}
\end{array}
\]

$\forall(\Delta)$ stands for $\forall x_0{:}\,\sigma_0, \dots, x_{n-1}{:}\,\sigma_{n-1}$, where the items in $\Delta$ are enumerated as $(x_0, \sigma_0), \dots, (x_{n-1}, \sigma_{n-1})$.

CHC Representation.
Now we introduce '$(|L{:}\ S|)_{\Pi,f}$', the set (in most cases, singleton) of CHCs modeling the computation performed by the labeled statement $L{:}\ S$ in $f$ from $\Pi$. Unlike the informal descriptions in §1, we turn to pattern matching instead of equations, to simplify the proofs in Appendix C.3. Below we show some of the rules; the complete rules are presented in Appendix B. The variables introduced by the rules (e.g. $x_\circ$, $x_*$) should be fresh.

The following is the rule for mutable (re)borrow.

\[
(|L{:}\ \mathrm{let}\ y = \mathrm{mutbor}_\alpha\, x;\ \mathrm{goto}\ L'|)_{\Pi,f} :=
\begin{cases}
\bigl\{\, \forall(\Delta_{\Pi,f,L} + \{(x_\circ, (|T|))\}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'}[\langle {*}x, x_\circ \rangle / y,\ \langle x_\circ \rangle / x] \,\bigr\} & (\mathrm{Ty}_{\Pi,f,L}(x) = \mathrm{own}\ T) \\[4pt]
\bigl\{\, \forall(\Delta_{\Pi,f,L} + \{(x_\circ, (|T|))\}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'}[\langle {*}x, x_\circ \rangle / y,\ \langle x_\circ, {\circ}x \rangle / x] \,\bigr\} & (\mathrm{Ty}_{\Pi,f,L}(x) = \mathrm{mut}_\alpha\, T)
\end{cases}
\]

The value at the end of borrow is represented as a newly introduced variable $x_\circ$.

Below is the rule for release of a variable.

\[
(|L{:}\ \mathrm{drop}\ x;\ \mathrm{goto}\ L'|)_{\Pi,f} :=
\begin{cases}
\bigl\{\, \forall(\Delta_{\Pi,f,L}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'} \,\bigr\} & (\mathrm{Ty}_{\Pi,f,L}(x) = \check{P}\, T) \\[4pt]
\bigl\{\, \forall(\Delta_{\Pi,f,L} - \{(x, \mathrm{mut}\ (|T|))\} + \{(x_*, (|T|))\}).\ \check\varphi_{\Pi,f,L}[\langle x_*, x_* \rangle / x] \Longleftarrow \check\varphi_{\Pi,f,L'} \,\bigr\} & (\mathrm{Ty}_{\Pi,f,L}(x) = \mathrm{mut}_\alpha\, T)
\end{cases}
\]

When a variable $x$ of type $\mathrm{mut}_\alpha\, T$ is dropped/released, we check the prophesied value at the end of borrow.

Below is the rule for a function call.

\[
(|L{:}\ \mathrm{let}\ y = g\langle \cdots \rangle(x_0, \dots, x_{n-1});\ \mathrm{goto}\ L'|)_{\Pi,f} :=
\bigl\{\, \forall(\Delta_{\Pi,f,L} + \{(y, (|\mathrm{Ty}_{\Pi,f,L'}(y)|))\}).\ \check\varphi_{\Pi,f,L} \Longleftarrow g_{\mathrm{entry}}(x_0, \dots, x_{n-1}, y) \wedge \check\varphi_{\Pi,f,L'} \,\bigr\}
\]

The body (the right-hand side of $\Longleftarrow$) of the CHC contains two formulas, which yields a kind of call stack at the level of CHCs.

Below is the rule for a return from a function.

\[
(|L{:}\ \mathrm{return}\ x|)_{\Pi,f} := \bigl\{\, \forall(\Delta_{\Pi,f,L}).\ \check\varphi_{\Pi,f,L}[x / \mathit{res}] \Longleftarrow \top \,\bigr\}
\]

The variable $\mathit{res}$ is forced to be equal to the returned variable $x$.

Footnote: For simplicity, we assume that the parameters of each function are sorted respecting some fixed order on variables (with $\mathit{res}$ coming at the last), and we enumerate various items in this fixed order.
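The rules for mutable borrow and drop can also be read operationally, which may help intuition. The following Rust sketch is only an informal illustration under our own assumptions, not part of the formal translation: values are restricted to integers, the names `MutRef`, `mutbor_own`, `mutbor_mut`, `drop_mut` and `inc_through_borrow` are hypothetical, and the fresh variable $x_\circ$ is simulated by enumerating candidate guesses from a finite range.

```rust
/// Model of a mutable reference to an integer: the pair <current, final>.
/// (Hypothetical name; integers only, for illustration.)
#[derive(Clone, Copy, Debug, PartialEq)]
struct MutRef {
    cur: i64, // current target value (*x)
    fin: i64, // prophesied value at the end of the borrow (°x)
}

/// Mutable borrow from an owned integer (case Ty(x) = own T):
/// the borrower gets <*x, guess>, and the lender continues with <guess>.
fn mutbor_own(x: i64, guess: i64) -> (MutRef, i64) {
    (MutRef { cur: x, fin: guess }, guess)
}

/// Reborrow from a mutable reference (case Ty(x) = mut T):
/// the borrower gets <*x, guess>, and the lender continues with <guess, °x>.
fn mutbor_mut(x: MutRef, guess: i64) -> (MutRef, MutRef) {
    (
        MutRef { cur: x.cur, fin: guess },
        MutRef { cur: guess, fin: x.fin },
    )
}

/// Drop of a mutable reference: the pattern <x*, x*> matches only when the
/// current value agrees with the prophesied final value.
fn drop_mut(x: MutRef) -> bool {
    x.cur == x.fin
}

/// Borrow an owned integer mutably, increment through the reference, and
/// drop the reference. Returns the lender's value for every guess that
/// survives the check performed at drop.
fn inc_through_borrow(x: i64, guesses: std::ops::RangeInclusive<i64>) -> Vec<i64> {
    guesses
        .filter_map(|g| {
            let (mut r, x_after) = mutbor_own(x, g);
            r.cur += 1; // the write *r += 1
            if drop_mut(r) {
                Some(x_after)
            } else {
                None
            }
        })
        .collect()
}
```

In this reading, only the guess that equals the value actually written survives, so the lender observes the updated value without any heap being modeled.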
Finally, $(|\Pi|)$, the CHC system that represents the COR program $\Pi$ (or the CHC representation of $\Pi$), is defined as follows.

\[
(|\Pi|) := \Bigl(\ \sum_{F \text{ in } \Pi,\ L{:}\,S \,\in\, \mathrm{LabelStmt}_F} (|L{:}\ S|)_{\Pi, \mathrm{name}(F)},\ \ \bigl(\Xi_{\Pi,f,L}\bigr)_{f_L \text{ s.t. } (f,L) \in \mathrm{FnLabel}_\Pi} \Bigr)
\]

Example 2 (CHC Representation).
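We present below the CHC representation of take-max described earlier. We omit the CHCs on inc-max here. We have also excluded the variable binders '$\forall \cdots$'.

\[
\begin{array}{rcl}
\textit{take-max}_{\mathrm{entry}}(\mathit{ma}, \mathit{mb}, \mathit{res}) & \Longleftarrow & \textit{take-max}_{\mathrm{L1}}(\mathit{ma}, \mathit{mb}, \langle {*}\mathit{ma} \mathbin{>=} {*}\mathit{mb} \rangle, \mathit{res}) \\
\textit{take-max}_{\mathrm{L1}}(\mathit{ma}, \mathit{mb}, \langle \mathrm{inj}_1\, \mathit{ord}_{*!} \rangle, \mathit{res}) & \Longleftarrow & \textit{take-max}_{\mathrm{L2}}(\mathit{ma}, \mathit{mb}, \langle \mathit{ord}_{*!} \rangle, \mathit{res}) \\
\textit{take-max}_{\mathrm{L1}}(\mathit{ma}, \mathit{mb}, \langle \mathrm{inj}_0\, \mathit{ord}_{*!} \rangle, \mathit{res}) & \Longleftarrow & \textit{take-max}_{\mathrm{L5}}(\mathit{ma}, \mathit{mb}, \langle \mathit{ord}_{*!} \rangle, \mathit{res}) \\
\textit{take-max}_{\mathrm{L2}}(\mathit{ma}, \mathit{mb}, \mathit{ou}, \mathit{res}) & \Longleftarrow & \textit{take-max}_{\mathrm{L3}}(\mathit{ma}, \mathit{mb}, \mathit{res}) \\
\textit{take-max}_{\mathrm{L3}}(\mathit{ma}, \langle \mathit{mb}_*, \mathit{mb}_* \rangle, \mathit{res}) & \Longleftarrow & \textit{take-max}_{\mathrm{L4}}(\mathit{ma}, \mathit{res}) \\
\textit{take-max}_{\mathrm{L4}}(\mathit{ma}, \mathit{ma}) & \Longleftarrow & \top \\
\textit{take-max}_{\mathrm{L5}}(\mathit{ma}, \mathit{mb}, \mathit{ou}, \mathit{res}) & \Longleftarrow & \textit{take-max}_{\mathrm{L6}}(\mathit{ma}, \mathit{mb}, \mathit{res}) \\
\textit{take-max}_{\mathrm{L6}}(\langle \mathit{ma}_*, \mathit{ma}_* \rangle, \mathit{mb}, \mathit{res}) & \Longleftarrow & \textit{take-max}_{\mathrm{L7}}(\mathit{mb}, \mathit{res}) \\
\textit{take-max}_{\mathrm{L7}}(\mathit{mb}, \mathit{mb}) & \Longleftarrow & \top
\end{array}
\]

The fifth and eighth CHCs represent the release of mb / ma. The sixth and ninth CHCs represent the determination of the return value res.

Now we formally state and prove the correctness of the CHC representation.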
Notations.
We use $\{|\cdots|\}$ (instead of $\{\cdots\}$) for multisets. $A \oplus B$ (or more generally $\bigoplus_\lambda A_\lambda$) denotes the multiset sum. For example, $\{|0, 1|\} \oplus \{|1|\} = \{|0, 1, 1|\} \neq \{|0, 1|\}$.

Readout and Safe Readout.
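We introduce a few judgments to formally describe how to read out data from the heap.

First, the judgment '$\mathrm{readout}_{\mathbf{H}}({*}a :: T \mid v; \mathcal{M})$' (the data at the address $a$ of type $T$ can be read out from the heap $\mathbf{H}$ as the value $v$, yielding the memory footprint $\mathcal{M}$) is defined as follows. Here, a memory footprint $\mathcal{M}$ is a finite multiset of addresses, which is employed for monitoring the memory usage. We write $\#T$ for the size of the type $T$ and $(n)_{\geq 0} := \max\{n, 0\}$.

\[
\dfrac{\mathbf{H}(a) = a' \qquad \mathrm{readout}_{\mathbf{H}}({*}a' :: T \mid v; \mathcal{M})}
      {\mathrm{readout}_{\mathbf{H}}({*}a :: \mathrm{own}\ T \mid \langle v \rangle; \mathcal{M} \oplus \{|a|\})}
\qquad
\dfrac{\mathrm{readout}_{\mathbf{H}}({*}a :: T[\mu X.T / X] \mid v; \mathcal{M})}
      {\mathrm{readout}_{\mathbf{H}}({*}a :: \mu X.T \mid v; \mathcal{M})}
\]
\[
\dfrac{\mathbf{H}(a) = n}
      {\mathrm{readout}_{\mathbf{H}}({*}a :: \mathrm{int} \mid n; \{|a|\})}
\qquad
\dfrac{}{\mathrm{readout}_{\mathbf{H}}({*}a :: \mathrm{unit} \mid (); \emptyset)}
\]
\[
\dfrac{\begin{array}{c}
\mathbf{H}(a) = i \in [2] \qquad \text{for any } k \in [(\#T_{1-i} - \#T_i)_{\geq 0}],\ \mathbf{H}(a + 1 + \#T_i + k) = 0 \\
\mathrm{readout}_{\mathbf{H}}({*}(a+1) :: T_i \mid v; \mathcal{M})
\end{array}}
      {\mathrm{readout}_{\mathbf{H}}\bigl({*}a :: T_0 + T_1 \mid \mathrm{inj}_i\, v;\ \mathcal{M} \oplus \{|a|\} \oplus \{|\, a + 1 + \#T_i + k \mid k \in [(\#T_{1-i} - \#T_i)_{\geq 0}] \,|\}\bigr)}
\]
\[
\dfrac{\mathrm{readout}_{\mathbf{H}}({*}a :: T_0 \mid v_0; \mathcal{M}_0) \qquad \mathrm{readout}_{\mathbf{H}}({*}(a + \#T_0) :: T_1 \mid v_1; \mathcal{M}_1)}
      {\mathrm{readout}_{\mathbf{H}}({*}a :: T_0 \times T_1 \mid (v_0, v_1); \mathcal{M}_0 \oplus \mathcal{M}_1)}
\]

Footnote: Here we can ignore mutable/immutable references, because we focus on what we call simple functions, as explained later.

For example, '$\mathrm{readout}_{\{(100, 7), (101, 5)\}}({*}100 :: \mathrm{int} \times \mathrm{int} \mid (7, 5); \{|100, 101|\})$' holds.

Next, '$\mathrm{readout}_{\mathbf{H}}(\mathbf{F} :: \Gamma \mid \mathcal{F}; \mathcal{M})$' (the data of the stack frame $\mathbf{F}$ respecting the variable context $\Gamma$ can be read out from $\mathbf{H}$ as $\mathcal{F}$, yielding $\mathcal{M}$) is defined as follows. $\mathrm{dom}\, \Gamma$ stands for $\{x \mid x :^{a} T \in \Gamma\}$.

\[
\dfrac{\mathrm{dom}\, \mathbf{F} = \mathrm{dom}\, \Gamma \qquad \text{for any } x : \mathrm{own}\ T \in \Gamma,\ \mathrm{readout}_{\mathbf{H}}({*}\mathbf{F}(x) :: T \mid v_x; \mathcal{M}_x)}
      {\mathrm{readout}_{\mathbf{H}}\bigl(\mathbf{F} :: \Gamma \mid \{(x, \langle v_x \rangle) \mid x \in \mathrm{dom}\, \mathbf{F}\};\ \bigoplus_{x \in \mathrm{dom}\, \mathbf{F}} \mathcal{M}_x\bigr)}
\]

Finally, '$\mathrm{safe}_{\mathbf{H}}(\mathbf{F} :: \Gamma \mid \mathcal{F})$' (the data of $\mathbf{F}$ respecting $\Gamma$ can be safely read out from $\mathbf{H}$ as $\mathcal{F}$) is defined as follows.

\[
\dfrac{\mathrm{readout}_{\mathbf{H}}(\mathbf{F} :: \Gamma \mid \mathcal{F}; \mathcal{M}) \qquad \mathcal{M} \text{ has no duplicate items}}
      {\mathrm{safe}_{\mathbf{H}}(\mathbf{F} :: \Gamma \mid \mathcal{F})}
\]

Here, the 'no duplicate items' precondition checks the safety on the ownership.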
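To make the role of the memory footprint concrete, here is a minimal Rust sketch of the readout judgment for a fragment of the types: integers, unit, owning pointers and pairs (sum types and recursive types are left out to keep it short). The names `Ty`, `Val`, `Heap`, `size`, `readout` and `no_duplicates` are our own, and the type sizes used (1 for an integer or a pointer, 0 for unit, the sum of the components for a pair) are an assumption chosen to agree with the footprints produced by the rules.

```rust
use std::collections::{HashMap, HashSet};

/// Types of the fragment covered by this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Int,
    Unit,
    Own(Box<Ty>),
    Pair(Box<Ty>, Box<Ty>),
}

/// Values read out from the heap; `Boxed(v)` stands for <v>.
#[derive(Debug, Clone, PartialEq)]
enum Val {
    Int(i64),
    Unit,
    Boxed(Box<Val>),
    Pair(Box<Val>, Box<Val>),
}

/// A heap maps addresses to integers (an integer may encode an address).
type Heap = HashMap<usize, i64>;

/// Number of memory cells occupied by a type (assumed sizes, see lead-in).
fn size(t: &Ty) -> usize {
    match t {
        Ty::Int | Ty::Own(_) => 1,
        Ty::Unit => 0,
        Ty::Pair(t0, t1) => size(t0) + size(t1),
    }
}

/// Reads the data at address `a` as a value of type `t`.
/// Returns the value and the footprint (a multiset of addresses, kept as a
/// Vec), or None when no derivation exists (e.g. an unmapped address).
fn readout(h: &Heap, a: usize, t: &Ty) -> Option<(Val, Vec<usize>)> {
    match t {
        Ty::Int => Some((Val::Int(*h.get(&a)?), vec![a])),
        Ty::Unit => Some((Val::Unit, vec![])),
        Ty::Own(inner) => {
            let raw = *h.get(&a)?;
            if raw < 0 {
                return None; // not a valid address
            }
            let (v, mut m) = readout(h, raw as usize, inner)?;
            m.push(a); // the pointer cell itself is part of the footprint
            Some((Val::Boxed(Box::new(v)), m))
        }
        Ty::Pair(t0, t1) => {
            let (v0, mut m0) = readout(h, a, t0)?;
            let (v1, m1) = readout(h, a + size(t0), t1)?;
            m0.extend(m1); // multiset sum of the two footprints
            Some((Val::Pair(Box::new(v0), Box::new(v1)), m0))
        }
    }
}

/// The side condition of safe readout: no address occurs twice.
fn no_duplicates(m: &[usize]) -> bool {
    let mut seen = HashSet::new();
    m.iter().all(|a| seen.insert(*a))
}
```

For instance, reading a pair of two owning pointers that both point to the same cell succeeds as a plain readout, but its footprint contains that cell twice, so the safety check rejects it; this is how aliasing that violates ownership is excluded.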
COS-based Model.
Now we introduce the COS-based model (COS stands for concrete operational semantics) $f^{\mathrm{COS}}_\Pi$ to formally describe the expected input-output relation. Here, for simplicity, $f$ is restricted to one that does not take lifetime parameters (we call such a function simple; the input/output types of a simple function cannot contain references). We define $f^{\mathrm{COS}}_\Pi$ as the predicate (on values of sorts $(|T_0|), \dots, (|T_{n-1}|), (|U|)$ if $f$'s input/output types are $T_0, \dots, T_{n-1}, U$) given by the following rule.

\[
\dfrac{\begin{array}{c}
\mathbf{C}_0 \to_\Pi \cdots \to_\Pi \mathbf{C}_N \qquad \mathrm{final}_\Pi(\mathbf{C}_N) \qquad \mathbf{C}_0 = [f, \mathrm{entry}]\ \mathbf{F} \mid \mathbf{H} \qquad \mathbf{C}_N = [f, L]\ \mathbf{F}' \mid \mathbf{H}' \\
\mathrm{safe}_{\mathbf{H}}\bigl(\mathbf{F} :: \Gamma_{\Pi,f,\mathrm{entry}} \bigm| \{(x_i, v_i) \mid i \in [n]\}\bigr) \qquad \mathrm{safe}_{\mathbf{H}'}\bigl(\mathbf{F}' :: \Gamma_{\Pi,f,L} \bigm| \{(y, w)\}\bigr)
\end{array}}
      {f^{\mathrm{COS}}_\Pi(v_0, \dots, v_{n-1}, w)}
\]

Here $\Gamma_{\Pi,f,L}$ is the variable context for the label $L$ of $f$ in the program $\Pi$.

Correctness Theorem.
Finally, the correctness (both soundness and completeness) of the CHC representation is simply stated as follows.
Theorem 1 (Correctness of the CHC Representation).
For any program $\Pi$ and simple function $f$ in $\Pi$, $f^{\mathrm{COS}}_\Pi$ is equivalent to $\mathcal{M}^{\mathrm{least}}_{(|\Pi|)}(f_{\mathrm{entry}})$.

Proof. The details are presented in Appendix C. We outline the proof below.

First, we introduce abstract operational semantics (Appendix C.1), where we get rid of heaps and directly represent each variable in the program simply as a value with abstract variables, which is strongly related to prophecy variables (see the discussion of prophecy variables in the related work). Next, we introduce SLDC resolution (Appendix C.3) for CHC systems and find a bisimulation between abstract operational semantics and SLDC resolution (Lemma 3), whereby we show that the AOS-based model, defined analogously to the COS-based model, is equivalent to the least model of the CHC representation (Theorem 2). Moreover, we find a bisimulation between concrete and abstract operational semantics (Lemma 5) and prove that the COS-based model is equivalent to the AOS-based model (Theorem 3).

Finally, combining the equivalences of Theorem 2 and Theorem 3, we achieve the proof for the correctness of the CHC representation. $\square$

Interestingly, as by-products of the proof, we have also shown the soundness of the type system in terms of preservation and progression, in both concrete and abstract operational semantics. See Appendix C.2 and Appendix C.4 for details. Simplification and generalization of the proofs is left for future work.
We give advanced examples of pointer-manipulating Rust programs and their CHC representations. For readability, we write programs in Rust (with ghost annotations) instead of COR. In addition, CHCs are written in an informal style like §1, preferring equalities to pattern matching.
Example 3.
Consider the following program, a variant of just_rec presented earlier.

    fn choose<'a>(ma: &'a mut i32, mb: &'a mut i32) -> &'a mut i32 {
      if rand() { drop mb; ma } else { drop ma; mb }
    }
    fn linger_dec<'a>(ma: &'a mut i32) -> bool {
      *ma -= 1; if rand() { drop ma; return true; }
      let mut b = rand(); let old_b = b; intro 'b; let mb = &'b mut b;
      let r2 = linger_dec<'b>(choose<'b>(ma, mb)); now 'b;
      r2 && old_b >= b
    }

Unlike just_rec, the function linger_dec can modify the local variable of an arbitrarily deep ancestor. Interestingly, each recursive call to linger_dec can introduce a new lifetime 'b, which yields arbitrarily many layers of lifetimes.

Suppose we wish to verify that linger_dec never returns false. If we use, like JustRec+, a predicate over the memory states $h, h'$ and the stack pointer $\mathit{sp}$, we have to discover the quantified invariant $\forall i \leq \mathit{sp}.\ h[i] \geq h'[i]$. In contrast, our approach reduces this verification problem to the following CHCs:

\[
\begin{array}{rcl}
\mathit{Choose}(\langle a, a_\circ \rangle, \langle b, b_\circ \rangle, r) & \Longleftarrow & b_\circ = b \wedge r = \langle a, a_\circ \rangle \\
\mathit{Choose}(\langle a, a_\circ \rangle, \langle b, b_\circ \rangle, r) & \Longleftarrow & a_\circ = a \wedge r = \langle b, b_\circ \rangle \\
\mathit{LingerDec}(\langle a, a_\circ \rangle, r) & \Longleftarrow & a' = a - 1 \wedge a_\circ = a' \wedge r = \mathrm{true} \\
\mathit{LingerDec}(\langle a, a_\circ \rangle, r) & \Longleftarrow & a' = a - 1 \wedge \mathit{oldb} = b \wedge \mathit{Choose}(\langle a', a_\circ \rangle, \langle b, b_\circ \rangle, \mathit{mc}) \\
 & & {} \wedge \mathit{LingerDec}(\mathit{mc}, r') \wedge r = (r' \mathbin{\&\&} \mathit{oldb} \mathbin{>=} b_\circ) \\
r = \mathrm{true} & \Longleftarrow & \mathit{LingerDec}(\langle a, a_\circ \rangle, r).
\end{array}
\]

This can be solved by many solvers since it has a very simple model:

\[
\begin{array}{rcl}
\mathit{Choose}(\langle a, a_\circ \rangle, \langle b, b_\circ \rangle, r) & :\Longleftrightarrow & (b_\circ = b \wedge r = \langle a, a_\circ \rangle) \vee (a_\circ = a \wedge r = \langle b, b_\circ \rangle) \\
\mathit{LingerDec}(\langle a, a_\circ \rangle, r) & :\Longleftrightarrow & r = \mathrm{true} \wedge a \geq a_\circ.
\end{array}
\]

Example 4.
Combined with recursive data structures, our method turns out to be more interesting. Let us consider the following Rust code:

    enum List { Cons(i32, Box<List>), Nil } use List::*;
    fn take_some<'a>(mxs: &'a mut List) -> &'a mut i32 {
      match mxs {
        Cons(mx, mxs2) => if rand() { drop mxs2; mx }
                          else { drop mx; take_some<'a>(mxs2) }
        Nil => { take_some(mxs) }
      }
    }
    fn sum(xs: &List) -> i32 {
      match xs { Cons(x, xs2) => x + sum(xs2), Nil => 0 }
    }
    fn inc_some(mut xs: List) -> bool {
      let n = sum(&xs); intro 'a; let my = take_some<'a>(&'a mut xs);
      *my += 1; drop my; now 'a; let m = sum(&xs); m == n + 1
    }

This is a program that manipulates singly linked integer lists, defined as a recursive data type. take_some takes a mutable reference to a list and returns a mutable reference to some element of the list. sum calculates the sum of the elements of a list. inc_some increments some element of a list via a mutable reference and checks that the sum of the elements of the list has increased by 1.

Footnote: In COR, List can be expressed as $\mu X.\ \mathrm{int} \times \mathrm{own}\ X + \mathrm{unit}$.

Suppose we wish to verify that inc_some never returns false. Our method translates this verification problem into the following CHCs.

\[
\begin{array}{rcl}
\mathit{TakeSome}(\langle [x | \mathit{xs}'], \mathit{xs}_\circ \rangle, r) & \Longleftarrow & \mathit{xs}_\circ = [x_\circ | \mathit{xs}'_\circ] \wedge \mathit{xs}'_\circ = \mathit{xs}' \wedge r = \langle x, x_\circ \rangle \\
\mathit{TakeSome}(\langle [x | \mathit{xs}'], \mathit{xs}_\circ \rangle, r) & \Longleftarrow & \mathit{xs}_\circ = [x_\circ | \mathit{xs}'_\circ] \wedge x_\circ = x \wedge \mathit{TakeSome}(\langle \mathit{xs}', \mathit{xs}'_\circ \rangle, r) \\
\mathit{TakeSome}(\langle [\,], \mathit{xs}_\circ \rangle, r) & \Longleftarrow & \mathit{TakeSome}(\langle [\,], \mathit{xs}_\circ \rangle, r) \\
\mathit{Sum}(\langle [x | \mathit{xs}'] \rangle, r) & \Longleftarrow & \mathit{Sum}(\langle \mathit{xs}' \rangle, r') \wedge r = x + r' \\
\mathit{Sum}(\langle [\,] \rangle, r) & \Longleftarrow & r = 0 \\
\mathit{IncSome}(\mathit{xs}, r) & \Longleftarrow & \mathit{Sum}(\langle \mathit{xs} \rangle, n) \wedge \mathit{TakeSome}(\langle \mathit{xs}, \mathit{xs}_\circ \rangle, \langle y, y_\circ \rangle) \wedge y_\circ = y + 1 \\
 & & {} \wedge \mathit{Sum}(\langle \mathit{xs}_\circ \rangle, m) \wedge r = (m \mathbin{==} n + 1) \\
r = \mathrm{true} & \Longleftarrow & \mathit{IncSome}(\mathit{xs}, r)
\end{array}
\]

Footnote: $[x | \mathit{xs}]$ is the cons made of the head $x$ and the tail $\mathit{xs}$. $[\,]$ is the nil. In our formal logic, they are expressed as $\mathrm{inj}_0\,(x, \langle \mathit{xs} \rangle)$ and $\mathrm{inj}_1\,()$.

A crucial technique used here is subdivision of a mutable reference, which is achieved with the constraint $\mathit{xs}_\circ = [x_\circ | \mathit{xs}'_\circ]$.

We can give this CHC system a very simple model, using an auxiliary function $\mathrm{sum}$ (satisfying $\mathrm{sum}([x | \mathit{xs}']) := x + \mathrm{sum}(\mathit{xs}')$, $\mathrm{sum}([\,]) := 0$):

\[
\begin{array}{rcl}
\mathit{TakeSome}(\langle \mathit{xs}, \mathit{xs}_\circ \rangle, \langle y, y_\circ \rangle) & :\Longleftrightarrow & y_\circ - y = \mathrm{sum}(\mathit{xs}_\circ) - \mathrm{sum}(\mathit{xs}) \\
\mathit{Sum}(\langle \mathit{xs} \rangle, r) & :\Longleftrightarrow & r = \mathrm{sum}(\mathit{xs}) \\
\mathit{IncSome}(\mathit{xs}, r) & :\Longleftrightarrow & r = \mathrm{true}.
\end{array}
\]

Although the model relies on the function $\mathrm{sum}$, the validity of the model can be checked without induction on $\mathrm{sum}$ (i.e. we can check the validity of each CHC just by properly unfolding the definition of $\mathrm{sum}$ a few times).

The example can be fully automatically and promptly verified by our approach using HoIce [12,11] as the back-end CHC solver; see the experimental evaluation.

We discuss here how our idea can be extended and enhanced.
Applying Various Verification Techniques.
Our idea can also be expressed as a translation of a pointer-manipulating Rust program into a program of a stateless functional programming language, which allows us to use various verification techniques not limited to CHCs. Access to future information can be modeled using non-determinism. To express the value $a_\circ$ coming at the end of mutable borrow in CHCs, we just randomly guess the value with non-determinism. At the time we actually release a mutable reference, we just check a' = a and cut off execution branches that do not pass the check.

For example, take_max / inc_max can be translated into the following OCaml program.

    let rec assume b = if b then () else assume b
    let take_max (a, a') (b, b') =
      if a >= b then (assume (b' = b); (a, a'))
      else (assume (a' = a); (b, b'))
    let inc_max a b =
      let a' = Random.int(0) in let b' = Random.int(0) in
      let (c, c') = take_max (a, a') (b, b') in
      assume (c' = c + 1); not (a' = b')
    let main a b = assert (inc_max a b)

'let a' = Random.int(0)' expresses a random guess and 'assume (a' = a)' expresses a check. The original problem "Does inc_max never return false?" is reduced to the problem "Does main never fail at assertion?" on the OCaml program.

Footnote: MoCHi [39], a higher-order model checker for OCaml, successfully verified the safety property for the OCaml representation above. It also successfully and instantly verified a similar representation of choose / linger_dec at Example 3.

This representation allows us to use various verification techniques, including model checking (higher-order, temporal, bounded, etc.), semi-automated verification (e.g. on Boogie [48]) and verification on proof assistants (e.g. Coq [15]). The property to be verified can be not only partial correctness, but also total correctness and liveness. Further investigation is left for future work.

Verifying Higher-order Programs.
We have to care about the following points in modeling closures: (i) a closure that encloses mutable references can be encoded as a pair of the main function and the 'drop function' called when the closure is released; (ii) a closure that updates enclosed data can be encoded as a function that returns, with the main return value, the updated version of the closure; (iii) a closure that updates external data through enclosed mutable references can also be modeled by combination of (i) and (ii). Further investigation on verification of higher-order Rust programs is left for future work.
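Points (i) to (iii) can be illustrated with a small Rust sketch. This is only our informal reading of the encoding, not a formalization from the paper: the closure is a hypothetical `move |k| { *r += k; }` capturing a mutable reference `r`, the reference is modeled as a pair of the current and the prophesied final value, and the names `MutRef`, `AddTo`, `call_add_to`, `drop_add_to` and `add_twice` are ours. As in the non-determinism-based reading, the prophesied value is simulated by enumerating guesses from a finite range.

```rust
/// Model of a mutable reference to an integer: <current, final>.
#[derive(Clone, Copy, Debug, PartialEq)]
struct MutRef {
    cur: i64,
    fin: i64,
}

/// Stateless model of a closure that captures the mutable reference `r`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct AddTo {
    r: MutRef,
}

/// (ii) The main function: returns the main return value (here unit)
/// together with the updated version of the closure.
fn call_add_to(c: AddTo, k: i64) -> ((), AddTo) {
    let updated = AddTo {
        r: MutRef { cur: c.r.cur + k, fin: c.r.fin },
    };
    ((), updated)
}

/// (i) The drop function: releasing the closure releases the captured
/// mutable reference, which checks the prophesied final value.
fn drop_add_to(c: AddTo) -> bool {
    c.r.cur == c.r.fin
}

/// (iii) External data `x` is updated through the enclosed reference:
/// build the closure, call it with 1 and then 2, release it, and return
/// the owner's value of `x` for every guess that passes the drop check.
fn add_twice(x: i64, guesses: std::ops::RangeInclusive<i64>) -> Vec<i64> {
    guesses
        .filter(|&g| {
            let c = AddTo { r: MutRef { cur: x, fin: g } };
            let ((), c) = call_add_to(c, 1);
            let ((), c) = call_add_to(c, 2);
            drop_add_to(c)
        })
        .collect()
}
```

Threading the updated closure through each call keeps the model free of mutable state, at the cost of making every call site responsible for using the latest version of the closure.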
Libraries with Unsafe Code.
Our translation does not use lifetime information; the correctness of our method is guaranteed by the nature of borrow. Whereas lifetimes are used for static check of the borrow discipline, many libraries in Rust (e.g. RefCell) provide a mechanism for dynamic ownership check.

We believe that such libraries with unsafe code can be verified for our method by a separation logic such as Iris [35,33], as RustBelt [32] does. The good news is that Iris has recently incorporated prophecy variables [34], which seems to fit well with our approach. This is an interesting topic for future work.

After the libraries are verified, we can turn to our method. For an easy example, Vec<T> [58] can be represented simply as a functional array; a mutable/immutable slice &mut[T]/&[T] can be represented as an array of mutable/immutable references. For another example, to deal with RefCell<T> [56], we pass around an array that maps a RefCell<T> address to data of type T equipped with an ownership counter; RefCell<T> itself is modeled simply as an address. Importantly, at the very time we take a mutable reference $\langle a, a_\circ \rangle$ from a ref-cell, the data at the array should be updated into $a_\circ$. Using methods such as pointer analysis [61], we can possibly shrink the array.

Footnote: To borrow a mutable/immutable reference from RefCell, we check and update the counter and take out the data from the array.

Footnote: In Rust, we can use RefCell to naturally encode data types with circular references (e.g. doubly-linked lists).

Still, our method does not go quite well with memory leaks [52] caused for example by combination of RefCell and Rc [57], because they obfuscate the ownership release of mutable references. We think that use of Rc etc. should rather be restricted for smooth verification. Further investigation is needed.

We report on the implementation of our verification tool and the preliminary experiments conducted with small benchmarks to confirm the effectiveness of our approach.

We implemented a prototype verification tool RustHorn (available at https://github.com/hopv/rust-horn ) based on the ideas described above. The tool supports basic features of Rust supported in COR, including recursions and recursive types especially.

The implementation translates the MIR (Mid-level Intermediate Representation) [45,51] of a Rust program into CHCs quite straightforwardly. Thanks to the nature of the translation, RustHorn can just rely on Rust's borrow check and forget about lifetimes. For efficiency, the predicate variables are constructed by the granularity of the vertices in the control-flow graph in MIR, unlike the per-label construction of the formal translation.

To measure the performance of RustHorn and the existing CHC-based verifier SeaHorn [23], we conducted preliminary experiments with benchmarks listed in Table 1. Each benchmark program is designed so that the Rust and C versions match. Each benchmark instance consists of either one program or a pair of safe and unsafe programs that are very similar to each other. The benchmarks and experimental results are accessible at https://github.com/hopv/rust-horn .

The benchmarks in the groups simple and bmc were taken from SeaHorn ( https://github.com/seahorn/seahorn/tree/master/test ), with the Rust versions written by us. They have been chosen based on the following criteria: they (i) consist of only features supported by core Rust, (ii) follow Rust's ownership discipline, and (iii) are small enough to be amenable for manual translation from C to Rust.

The remaining six benchmark groups are built by us and consist of programs featuring mutable references. The groups inc-max, just-rec and linger-dec are based on the examples that have appeared earlier in this paper. The group swap-dec consists of programs that perform repeated involved updates via mutable references to mutable references.
The groups lists and trees feature destructive updates on recursive data structures (lists and trees) via mutable references, with one interesting program among them explained in Example 4.

Footnote: In order to use the MIR, RustHorn's implementation depends on the unstable nightly version of the Rust compiler, which causes a slight portability issue.

Footnote: For base/3 and repeat/3 of inc-max, the address-taking parts were already removed, probably by inaccurate pointer analysis.

Table 1 shows the results of the experiments.

Interestingly, the combination of RustHorn and HoIce succeeded in verifying many programs with recursive data types (lists and trees), although it failed at difficult programs. HoIce, unlike Spacer, can find models defined with primitive recursive functions for recursive data types.

False alarms of SeaHorn for the last six groups are mainly due to problematic approximation of SeaHorn for pointers and heap memories, as discussed earlier.

The combination of RustHorn and HoIce took a relatively long time or reported timeout for some programs, including unsafe ones, because HoIce is still an unstable tool compared to Spacer; in general, automated CHC solving can be rather unstable.
CHC-based Verification of Pointer-Manipulating Programs.
SeaHorn [23] is a representative existing tool for CHC-based verification of pointer-manipulating programs. It basically represents the heap memory as an array. Although some pointer analyses [24] are used to optimize the array representation of the heap, their approach suffers from the scalability problem discussed earlier (see also §4). Still, their approach is quite effective as automated verification, given that many real-world pointer-manipulating programs do not follow Rust-style ownership.

Another approach is taken by JayHorn [37,36], which translates Java programs (possibly using object pointers) to CHCs. They represent store invariants using special predicates pull and push. Although this allows faster reasoning about the heap than the array-based approach, it can suffer from more false alarms. We conducted a small experiment for JayHorn (0.6-alpha) on some of our benchmarks; it reported 'UNKNOWN' (instead of 'SAFE' or 'UNSAFE') even for simple programs such as the programs of the instance unique-scalar in simple and the instance basic in inc-max.

Verification for Rust.
Whereas we have presented the first CHC-based (fully automated) verification method specially designed for Rust-style ownership, there have been a number of studies on other types of verification for Rust.

Footnote: For example, inc-some/2 takes two mutable references in a list and increments on them; inc-all-t destructively increments all elements in a tree.

Footnote: We used the latest version of HoIce, whose algorithm for recursive types is presented in the full paper of [11].

Footnote: We also tried Spacer on JustRec+, the stack-pointer-based accurate representation of just_rec presented earlier.

[Table 1. Columns: Group; Instance; Property (safe/unsafe); RustHorn w/Spacer; RustHorn w/HoIce; SeaHorn w/Spacer (as is); SeaHorn w/Spacer (modified). Groups and named instances: simple (01, hhk2008, unique-scalar, among others); bmc (1, diamond-1, diamond-2, among others); inc-max (base, base/3, repeat, repeat/3); swap-dec (base, base/3, exact, exact/3); just-rec (base); linger-dec (base, base/3, exact, exact/3); lists (append, inc-all, inc-some, inc-some/2); trees (append-t, inc-all-t, inc-some-t, inc-some/2-t).]

Table 1. Benchmarks and experimental results on RustHorn and SeaHorn, with Spacer/Z3 and HoIce. "timeout" denotes timeout of 180 seconds; "false alarm" means reporting 'unsafe' for a safe program; "tool error" is a tool error of Spacer, which currently does not deal with recursive types well.
RustBelt [32] aims to formally prove high-level safety properties for Rust libraries with unsafe internal implementation, using manual reasoning on the higher-order concurrent separation logic Iris [35,33] on the Coq Proof Assistant [15]. Although their framework is flexible, the automation of the reasoning on the framework is little discussed. The language design of our COR is affected by their formal calculus $\lambda_{\mathrm{Rust}}$.

Electrolysis [67] translates some subset of Rust into a purely functional programming language to manually verify functional correctness on Lean Theorem Prover [49]. Although it clears out pointers to get simple models like our approach, Electrolysis' applicable scope is quite limited, because it deals with mutable references by simple static tracking of addresses based on lenses [20], not supporting even basic use cases such as dynamic selection of mutable references (e.g. take_max). In contrast, our approach covers all usages of pointers of the safe core of Rust.

Verification platforms based on fractional ownership have to use concrete indexing on the memory for programs like take_max / inc_max. In contrast, our idea leverages borrow-based ownership, and it can be applied also to semi-automated verification, as suggested in our discussion of other verification techniques.

Verification using Ownership.
Ownership has been applied to a wide range of verification. It has been used for detecting race conditions on concurrent programs [8,64] and analyzing the safety of memory allocation [63]. Separation logic based on ownership is also studied well [7,50,35]. Some verification platforms [14,5,21] support simple ownership. However, most prior studies on ownership-based verification are based on fractional or counting ownership. Verification under borrow-based ownership like Rust was little studied before our work.
Prophecy Variables.
Our idea of taking a future value to represent a mutable reference is linked to the notion of prophecy variables [1,68,34]. Jung et al. [34] propose a new Hoare-style logic with prophecy variables. In their logic, prophecy variables are not copyable, which is analogous to uncopyability of mutable references in Rust. This logic can probably be used for generalizing our idea, as suggested in our discussion of libraries with unsafe code.

We have proposed a novel method for CHC-based program verification, which represents a mutable reference as a pair of values, the current value and the future value at the time of release. We have formalized the method for a core language of Rust and proved its correctness. We have implemented a prototype verification tool for a subset of Rust and confirmed the effectiveness of our approach. We believe that this study establishes the foundation of verification leveraging borrow-based ownership.
Acknowledgments
This work was supported by JSPS KAKENHI Grant Number JP15H05706 and JP16K16004. We are grateful to the anonymous reviewers for insightful comments.
References
1. Abadi, M., Lamport, L.: The existence of refinement mappings. Theor. Comput. Sci. (2), 253–284 (1991). https://doi.org/10.1016/0304-3975(91)90224-P
2. Alberti, F., Bruttomesso, R., Ghilardi, S., Ranise, S., Sharygina, N.: Lazy abstraction with interpolants for arrays. In: Bjørner, N., Voronkov, A. (eds.) Logic for Programming, Artificial Intelligence, and Reasoning - 18th International Conference, LPAR-18, Mérida, Venezuela, March 11-15, 2012. Proceedings. Lecture Notes in Computer Science, vol. 7180, pp. 46–61. Springer (2012). https://doi.org/10.1007/978-3-642-28717-6_7
3. Astrauskas, V., Müller, P., Poli, F., Summers, A.J.: Leveraging Rust types for modular specification and verification (2018). https://doi.org/10.3929/ethz-b-000311092
4. Baranowski, M.S., He, S., Rakamaric, Z.: Verifying Rust programs with SMACK. In: Lahiri and Wang [42], pp. 528–535. https://doi.org/10.1007/978-3-030-01090-4_32
5. Barnett, M., Fähndrich, M., Leino, K.R.M., Müller, P., Schulte, W., Venter, H.: Specification and verification: The Spec (6), 81–91 (2011). https://doi.org/10.1145/1953122.1953145
6. Bjørner, N., Gurfinkel, A., McMillan, K.L., Rybalchenko, A.: Horn clause solvers for program verification. In: Beklemishev, L.D., Blass, A., Dershowitz, N., Finkbeiner, B., Schulte, W. (eds.) Fields of Logic and Computation II - Essays Dedicated to Yuri Gurevich on the Occasion of His 75th Birthday. Lecture Notes in Computer Science, vol. 9300, pp. 24–51. Springer (2015). https://doi.org/10.1007/978-3-319-23534-9_2
7. Bornat, R., Calcagno, C., O'Hearn, P.W., Parkinson, M.J.: Permission accounting in separation logic. In: Palsberg, J., Abadi, M. (eds.) Proceedings of the 32nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2005, Long Beach, California, USA, January 12-14, 2005. pp. 259–270. ACM (2005). https://doi.org/10.1145/1040305.1040327
8. Boyapati, C., Lee, R., Rinard, M.C.: Ownership types for safe programming: Preventing data races and deadlocks. In: Ibrahim, M., Matsuoka, S. (eds.) Proceedings of the 2002 ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages and Applications, OOPSLA 2002, Seattle, Washington, USA, November 4-8, 2002. pp. 211–230. ACM (2002). https://doi.org/10.1145/582419.582440
9. Boyland, J.: Checking interference with fractional permissions. In: Cousot, R. (ed.) Static Analysis, 10th International Symposium, SAS 2003, San Diego, CA, USA, June 11-13, 2003, Proceedings. Lecture Notes in Computer Science, vol. 2694, pp. 55–72. Springer (2003). https://doi.org/10.1007/3-540-44898-5_4
10. Bradley, A.R., Manna, Z., Sipma, H.B.: What's decidable about arrays? In: Emerson, E.A., Namjoshi, K.S. (eds.) Verification, Model Checking, and Abstract Interpretation, 7th International Conference, VMCAI 2006, Charleston, SC, USA, January 8-10, 2006, Proceedings. Lecture Notes in Computer Science, vol. 3855, pp. 427–442. Springer (2006). https://doi.org/10.1007/11609773_28
11. Champion, A., Chiba, T., Kobayashi, N., Sato, R.: ICE-based refinement type discovery for higher-order functional programs. In: Beyer, D., Huisman, M. (eds.) Tools and Algorithms for the Construction and Analysis of Systems - 24th International Conference, TACAS 2018, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2018, Thessaloniki, Greece, April 14-20, 2018, Proceedings, Part I. Lecture Notes in Computer Science, vol. 10805, pp. 365–384. Springer (2018). https://doi.org/10.1007/978-3-319-89960-2_20
12. Champion, A., Kobayashi, N., Sato, R.: HoIce: An ICE-based non-linear Horn clause solver. In: Ryu, S. (ed.) Programming Languages and Systems - 16th Asian Symposium, APLAS 2018, Wellington, New Zealand, December 2-6, 2018, Proceedings. Lecture Notes in Computer Science, vol. 11275, pp. 146–156. Springer (2018). https://doi.org/10.1007/978-3-030-02768-1_8
13. Clarke, D.G., Potter, J., Noble, J.: Ownership types for flexible alias protection. In: Freeman-Benson, B.N., Chambers, C. (eds.) Proceedings of the 1998 ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages & Applications (OOPSLA '98), Vancouver, British Columbia, Canada, October 18-22, 1998. pp. 48–64. ACM (1998). https://doi.org/10.1145/286936.286947
14. Cohen, E., Dahlweid, M., Hillebrand, M.A., Leinenbach, D., Moskal, M., Santen, T., Schulte, W., Tobies, S.: VCC: A practical system for verifying concurrent C. In: Berghofer, S., Nipkow, T., Urban, C., Wenzel, M. (eds.) Theorem Proving in Higher Order Logics, 22nd International Conference, TPHOLs 2009, Munich, Germany, August 17-20, 2009. Proceedings. Lecture Notes in Computer Science, vol. 5674, pp. 23–42. Springer (2009). https://doi.org/10.1007/978-3-642-03359-9_2
15. Coq Team: The Coq proof assistant (2020), https://coq.inria.fr/
16. van Emden, M.H., Kowalski, R.A.: The semantics of predicate logic as a programming language. Journal of the ACM (4), 733–742 (1976). https://doi.org/10.1145/321978.321991
17. Erdin, M.: Verification of Rust Generics, Typestates, and Traits. Master's thesis, ETH Zürich (2019)
18. Fedyukovich, G., Kaufman, S.J., Bodík, R.: Sampling invariants from frequency distributions. In: Stewart, D., Weissenbacher, G. (eds.) 2017 Formal Methods in Computer Aided Design, FMCAD 2017, Vienna, Austria, October 2-6, 2017. pp. 100–107. IEEE (2017). https://doi.org/10.23919/FMCAD.2017.8102247
19. Fedyukovich, G., Prabhu, S., Madhukar, K., Gupta, A.: Quantified invariants via syntax-guided synthesis. In: Dillig, I., Tasiran, S. (eds.) Computer Aided Verification - 31st International Conference, CAV 2019, New York City, NY, USA, July 15-18, 2019, Proceedings, Part I. Lecture Notes in Computer Science, vol. 11561, pp. 259–277. Springer (2019). https://doi.org/10.1007/978-3-030-25540-4_14
20. Foster, J.N., Greenwald, M.B., Moore, J.T., Pierce, B.C., Schmitt, A.: Combinators for bidirectional tree transformations: A linguistic approach to the view-update problem. ACM Trans. Program. Lang. Syst. (3), 17 (2007). https://doi.org/10.1145/1232420.1232424
21. Gondelman, L.: Un système de types pragmatique pour la vérification déductive des programmes. (A Pragmatic Type System for Deductive Verification). Ph.D. thesis, University of Paris-Saclay, France (2016), https://tel.archives-ouvertes.fr/tel-01533090
22. Grebenshchikov, S., Lopes, N.P., Popeea, C., Rybalchenko, A.: Synthesizing software verifiers from proof rules. In: Vitek, J., Lin, H., Tip, F. (eds.) ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '12, Beijing, China - June 11 - 16, 2012. pp. 405–416. ACM (2012). https://doi.org/10.1145/2254064.2254112
23. Gurfinkel, A., Kahsai, T., Komuravelli, A., Navas, J.A.: The SeaHorn verification framework. In: Kroening, D., Pasareanu, C.S. (eds.) Computer Aided Verification - 27th International Conference, CAV 2015, San Francisco, CA, USA, July 18-24, 2015, Proceedings, Part I. Lecture Notes in Computer Science, vol. 9206, pp. 343–361. Springer (2015). https://doi.org/10.1007/978-3-319-21690-4_20
24. Gurfinkel, A., Navas, J.A.: A context-sensitive memory model for verification of C/C++ programs. In: Ranzato, F. (ed.) Static Analysis - 24th International Symposium, SAS 2017, New York, NY, USA, August 30 - September 1, 2017, Proceedings. Lecture Notes in Computer Science, vol. 10422, pp. 148–168. Springer (2017). https://doi.org/10.1007/978-3-319-66706-5_8
25. Gurfinkel, A., Shoham, S., Meshman, Y.: SMT-based verification of parameterized systems. In: Zimmermann, T., Cleland-Huang, J., Su, Z. (eds.) Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2016, Seattle, WA, USA, November 13-18, 2016. pp. 338–348. ACM (2016). https://doi.org/10.1145/2950290.2950330
26. Gurfinkel, A., Shoham, S., Vizel, Y.: Quantifiers on demand. In: Lahiri and Wang [42], pp. 248–266. https://doi.org/10.1007/978-3-030-01090-4_15
27. Hahn, F.: Rust2Viper: Building a Static Verifier for Rust. Master's thesis, ETH Zürich (2016). https://doi.org/10.3929/ethz-a-010669150
28. Hoenicke, J., Majumdar, R., Podelski, A.: Thread modularity at many levels: A pearl in compositional verification. In: Castagna, G., Gordon, A.D. (eds.) Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, POPL 2017, Paris, France, January 18-20, 2017. pp. 473–485. ACM (2017). https://doi.org/10.1145/3009837
29. Hojjat, H., Rümmer, P.: The Eldarica Horn solver. In: Bjørner, N., Gurfinkel, A. (eds.) 2018 Formal Methods in Computer Aided Design, FMCAD 2018, Austin, TX, USA, October 30 - November 2, 2018. pp. 1–7. IEEE (2018). https://doi.org/10.23919/FMCAD.2018.8603013
30. Horn, A.: On sentences which are true of direct unions of algebras. The Journal of Symbolic Logic (1), 14–21 (1951),
31. Jim, T., Morrisett, J.G., Grossman, D., Hicks, M.W., Cheney, J., Wang, Y.: Cyclone: A safe dialect of C. In: Ellis, C.S. (ed.) Proceedings of the General Track: 2002 USENIX Annual Technical Conference, June 10-15, 2002, Monterey, California, USA. pp. 275–288. USENIX (2002),
32. Jung, R., Jourdan, J., Krebbers, R., Dreyer, D.: RustBelt: Securing the foundations of the Rust programming language. PACMPL (POPL), 66:1–66:34 (2018). https://doi.org/10.1145/3158154
33. Jung, R., Krebbers, R., Jourdan, J., Bizjak, A., Birkedal, L., Dreyer, D.: Iris from the ground up: A modular foundation for higher-order concurrent separation logic. J. Funct. Program., e20 (2018). https://doi.org/10.1017/S0956796818000151
34. Jung, R., Lepigre, R., Parthasarathy, G., Rapoport, M., Timany, A., Dreyer, D., Jacobs, B.: The future is ours: Prophecy variables in separation logic. PACMPL (POPL), 45:1–45:32 (2020). https://doi.org/10.1145/3371113
35. Jung, R., Swasey, D., Sieczkowski, F., Svendsen, K., Turon, A., Birkedal, L., Dreyer, D.: Iris: Monoids and invariants as an orthogonal basis for concurrent reasoning. In: Rajamani, S.K., Walker, D. (eds.) Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2015, Mumbai, India, January 15-17, 2015. pp. 637–650. ACM (2015). https://doi.org/10.1145/2676726.2676980
36. Kahsai, T., Kersten, R., Rümmer, P., Schäf, M.: Quantified heap invariants for object-oriented programs. In: Eiter, T., Sands, D. (eds.) LPAR-21, 21st International Conference on Logic for Programming, Artificial Intelligence and Reasoning, Maun, Botswana, May 7-12, 2017. EPiC Series in Computing, vol. 46, pp. 368–384. EasyChair (2017)
37. Kahsai, T., Rümmer, P., Sanchez, H., Schäf, M.: JayHorn: A framework for verifying Java programs. In: Chaudhuri, S., Farzan, A. (eds.) Computer Aided Verification - 28th International Conference, CAV 2016, Toronto, ON, Canada, July 17-23, 2016, Proceedings, Part I. Lecture Notes in Computer Science, vol. 9779, pp. 352–358. Springer (2016). https://doi.org/10.1007/978-3-319-41528-4_19
38. Kalra, S., Goel, S., Dhawan, M., Sharma, S.: Zeus: Analyzing safety of smart contracts. In: 25th Annual Network and Distributed System Security Symposium, NDSS 2018, San Diego, California, USA, February 18-21, 2018. The Internet Society (2018)
39. Kobayashi, N., Sato, R., Unno, H.: Predicate abstraction and CEGAR for higher-order model checking. In: Hall, M.W., Padua, D.A. (eds.) Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2011, San Jose, CA, USA, June 4-8, 2011. pp. 222–233. ACM (2011). https://doi.org/10.1145/1993498.1993525
40. Komuravelli, A., Gurfinkel, A., Chaki, S.: SMT-based model checking for recursive programs. In: Biere, A., Bloem, R. (eds.) Computer Aided Verification - 26th International Conference, CAV 2014, Held as Part of the Vienna Summer of Logic, VSL 2014, Vienna, Austria, July 18-22, 2014. Proceedings. Lecture Notes in Computer Science, vol. 8559, pp. 17–34. Springer (2014). https://doi.org/10.1007/978-3-319-08867-9_2
41. Lahiri, S.K., Bryant, R.E.: Constructing quantified invariants via predicate abstraction. In: Steffen, B., Levi, G. (eds.) Verification, Model Checking, and Abstract Interpretation, 5th International Conference, VMCAI 2004, Venice, Italy, January 11-13, 2004, Proceedings. Lecture Notes in Computer Science, vol. 2937, pp. 267–281. Springer (2004). https://doi.org/10.1007/978-3-540-24622-0_22
42. Lahiri, S.K., Wang, C. (eds.): Automated Technology for Verification and Analysis - 16th International Symposium, ATVA 2018, Los Angeles, CA, USA, October 7-10, 2018, Proceedings, Lecture Notes in Computer Science, vol. 11138. Springer (2018). https://doi.org/10.1007/978-3-030-01090-4
43. Lattner, C., Adve, V.S.: Automatic pool allocation: Improving performance by controlling data structure layout in the heap. In: Sarkar, V., Hall, M.W. (eds.) Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, Chicago, IL, USA, June 12-15, 2005. pp. 129–142. ACM (2005). https://doi.org/10.1145/1065010.1065027
44. Lindner, M., Aparicius, J., Lindgren, P.: No panic! Verification of Rust programs by symbolic execution. In: 16th IEEE International Conference on Industrial Informatics, INDIN 2018, Porto, Portugal, July 18-20, 2018. pp. 108–114. IEEE (2018). https://doi.org/10.1109/INDIN.2018.8471992
45. Matsakis, N.D.: Introducing MIR (2016), https://blog.rust-lang.org/2016/04/19/MIR.html
46. Matsakis, N.D., Klock, II, F.S.: The Rust language. In: Feldman, M., Taft, S.T. (eds.) Proceedings of the 2014 ACM SIGAda annual conference on High integrity language technology, HILT 2014, Portland, Oregon, USA, October 18-21, 2014. pp. 103–104. ACM (2014). https://doi.org/10.1145/2663171.2663188
47. Matsushita, Y., Tsukada, T., Kobayashi, N.: RustHorn: CHC-based verification for Rust programs. In: Müller, P. (ed.) Programming Languages and Systems - 29th European Symposium on Programming, ESOP 2020, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020, Dublin, Ireland, April 25-30, 2020, Proceedings. Lecture Notes in Computer Science, vol. 12075, pp. 484–514. Springer (2020). https://doi.org/10.1007/978-3-030-44914-8_18
48. Microsoft: Boogie: An intermediate verification language (2020),
49. de Moura, L.M., Kong, S., Avigad, J., van Doorn, F., von Raumer, J.: The Lean theorem prover (system description). In: Felty, A.P., Middeldorp, A. (eds.) Automated Deduction - CADE-25 - 25th International Conference on Automated Deduction, Berlin, Germany, August 1-7, 2015, Proceedings. Lecture Notes in Computer Science, vol. 9195, pp. 378–388. Springer (2015). https://doi.org/10.1007/978-3-319-21401-6_26
50. Müller, P., Schwerhoff, M., Summers, A.J.: Viper: A verification infrastructure for permission-based reasoning. In: Jobstmann, B., Leino, K.R.M. (eds.) Verification, Model Checking, and Abstract Interpretation - 17th International Conference, VMCAI 2016, St. Petersburg, FL, USA, January 17-19, 2016. Proceedings. Lecture Notes in Computer Science, vol. 9583, pp. 41–62. Springer (2016). https://doi.org/10.1007/978-3-662-49122-5_2
51. Rust Community: The MIR (Mid-level IR) (2020), https://rust-lang.github.io/rustc-guide/mir/index.html
52. Rust Community: Reference cycles can leak memory - the Rust programming language (2020), https://doc.rust-lang.org/book/ch15-06-reference-cycles.html
53. Rust Community: RFC 2025: Nested method calls (2020), https://rust-lang.github.io/rfcs/2025-nested-method-calls.html
54. Rust Community: RFC 2094: Non-lexical lifetimes (2020), https://rust-lang.github.io/rfcs/2094-nll.html
55. Rust Community: Rust programming language (2020),
56. Rust Community: std::cell::RefCell - Rust (2020), https://doc.rust-lang.org/std/cell/struct.RefCell.html
57. Rust Community: std::rc::Rc - Rust (2020), https://doc.rust-lang.org/std/rc/struct.Rc.html
58. Rust Community: std::vec::Vec - Rust (2020), https://doc.rust-lang.org/std/vec/struct.Vec.html
59. Rust Community: Two-phase borrows (2020), https://rust-lang.github.io/rustc-guide/borrow_check/two_phase_borrows.html
60. Sato, R., Iwayama, N., Kobayashi, N.: Combining higher-order model checking with refinement type inference. In: Hermenegildo, M.V., Igarashi, A. (eds.) Proceedings of the 2019 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, PEPM@POPL 2019, Cascais, Portugal, January 14-15, 2019. pp. 47–53. ACM (2019). https://doi.org/10.1145/3294032.3294081
61. Steensgaard, B.: Points-to analysis in almost linear time. In: Boehm, H., Jr., G.L.S. (eds.) Conference Record of POPL'96: The 23rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Papers Presented at the Symposium, St. Petersburg Beach, Florida, USA, January 21-24, 1996. pp. 32–41. ACM Press (1996). https://doi.org/10.1145/237721.237727
62. Stump, A., Barrett, C.W., Dill, D.L., Levitt, J.R.: A decision procedure for an extensional theory of arrays. In: 16th Annual IEEE Symposium on Logic in Computer Science, Boston, Massachusetts, USA, June 16-19, 2001, Proceedings. pp. 29–37. IEEE Computer Society (2001). https://doi.org/10.1109/LICS.2001.932480
63. Suenaga, K., Kobayashi, N.: Fractional ownerships for safe memory deallocation. In: Hu, Z. (ed.) Programming Languages and Systems, 7th Asian Symposium, APLAS 2009, Seoul, Korea, December 14-16, 2009. Proceedings. Lecture Notes in Computer Science, vol. 5904, pp. 128–143. Springer (2009). https://doi.org/10.1007/978-3-642-10672-9_11
64. Terauchi, T.: Checking race freedom via linear programming. In: Gupta, R., Amarasinghe, S.P. (eds.) Proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation, Tucson, AZ, USA, June 7-13, 2008. pp. 1–10. ACM (2008). https://doi.org/10.1145/1375581.1375583
65. Toman, J., Pernsteiner, S., Torlak, E.: crust: A bounded verifier for Rust. In: Cohen, M.B., Grunske, L., Whalen, M. (eds.) 30th IEEE/ACM International Conference on Automated Software Engineering, ASE 2015, Lincoln, NE, USA, November 9-13, 2015. pp. 75–80. IEEE Computer Society (2015). https://doi.org/10.1109/ASE.2015.77
66. Ullrich, S.: Electrolysis reference (2016), http://kha.github.io/electrolysis/
67. Ullrich, S.: Simple Verification of Rust Programs via Functional Purification. Master's thesis, Karlsruhe Institute of Technology (2016)
68. Vafeiadis, V.: Modular fine-grained concurrency verification. Ph.D. thesis, University of Cambridge, UK (2008), http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.612221
69. Z3 Team: The Z3 theorem prover (2020), https://github.com/Z3Prover/z3
Open Access
This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
A Complementary Definitions on COR
A.1 Complete Typing Rules for Instructions
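The following are the complete rules for the typing judgment on instructions $I \colon_{\Pi,f} (\mathbf{\Gamma}, \mathbf{A}) \to (\mathbf{\Gamma}', \mathbf{A}')$. The variables on the right-hand side of one instruction should be mutually distinct. The rules for subtyping $T \le_{\mathbf{A}} U$ are explained later.

\[ \frac{\alpha \notin A_{\mathrm{ex}\,\Pi,f} \quad P = \mathsf{own}, \mathsf{mut}_\beta \quad \text{for any } \gamma \in \mathrm{Lifetime}_{P\,T},\ \alpha \le_{\mathbf{A}} \gamma}{\mathsf{let}\ y = \mathsf{mutbor}_\alpha\, x \colon_{\Pi,f} (\mathbf{\Gamma} + \{x \colon P\,T\}, \mathbf{A}) \to (\mathbf{\Gamma} + \{y \colon \mathsf{mut}_\alpha\, T,\ x \colon^{\dagger\alpha} P\,T\}, \mathbf{A})} \]

\[ \frac{\text{if } T \text{ is of form } \mathsf{own}\ U,\ \text{every } \mathsf{own} \text{ and } \mathsf{mut}_\alpha \text{ in } U \text{ is guarded by some } \mathsf{immut}_\beta}{\mathsf{drop}\ x \colon_{\Pi,f} (\mathbf{\Gamma} + \{x \colon T\}, \mathbf{A}) \to (\mathbf{\Gamma}, \mathbf{A})} \]

\[ \mathsf{immut}\ x \colon_{\Pi,f} (\mathbf{\Gamma} + \{x \colon \mathsf{mut}_\alpha\, T\}, \mathbf{A}) \to (\mathbf{\Gamma} + \{x \colon \mathsf{immut}_\alpha\, T\}, \mathbf{A}) \]

\[ \frac{x \colon \mathsf{mut}_\alpha\, T,\ y \colon P\,T \in \mathbf{\Gamma} \quad P = \mathsf{own}, \mathsf{mut}_\beta}{\mathsf{swap}(*x, *y) \colon_{\Pi,f} (\mathbf{\Gamma}, \mathbf{A}) \to (\mathbf{\Gamma}, \mathbf{A})} \]

\[ \mathsf{let}\ {*y} = x \colon_{\Pi,f} (\mathbf{\Gamma} + \{x \colon T\}, \mathbf{A}) \to (\mathbf{\Gamma} + \{y \colon \mathsf{own}\ T\}, \mathbf{A}) \]

\[ \mathsf{let}\ y = {*x} \colon_{\Pi,f} (\mathbf{\Gamma} + \{x \colon P\,P'\,T\}, \mathbf{A}) \to (\mathbf{\Gamma} + \{y \colon (P \circ P')\,T\}, \mathbf{A}) \]

\[ P \circ \mathsf{own} = \mathsf{own} \circ P := P \qquad R_\alpha \circ R'_\beta := R''_\alpha \ \text{ where } R'' = \begin{cases} \mathsf{mut} & (R = R' = \mathsf{mut}) \\ \mathsf{immut} & (\text{otherwise}) \end{cases} \]

\[ \frac{x \colon P\,T \in \mathbf{\Gamma} \quad T \colon \mathsf{copy}}{\mathsf{let}\ {*y} = \mathsf{copy}\ {*x} \colon_{\Pi,f} (\mathbf{\Gamma}, \mathbf{A}) \to (\mathbf{\Gamma} + \{y \colon \mathsf{own}\ T\}, \mathbf{A})} \]

\[ \mathsf{int} \colon \mathsf{copy} \qquad \mathsf{unit} \colon \mathsf{copy} \qquad \mathsf{immut}_\alpha\, T \colon \mathsf{copy} \qquad \frac{T \colon \mathsf{copy}}{\mu X.T \colon \mathsf{copy}} \qquad \frac{T_0, T_1 \colon \mathsf{copy}}{T_0 + T_1 \colon \mathsf{copy}} \qquad \frac{T_0, T_1 \colon \mathsf{copy}}{T_0 \times T_1 \colon \mathsf{copy}} \]

\[ \frac{T \le_{\mathbf{A}} U}{x\ \mathsf{as}\ U \colon_{\Pi,f} (\mathbf{\Gamma} + \{x \colon T\}, \mathbf{A}) \to (\mathbf{\Gamma} + \{x \colon U\}, \mathbf{A})} \]

\[ \frac{\begin{array}{c} \Sigma_{\Pi,g} = \langle \alpha'_0, \ldots, \alpha'_{m-1} \mid \alpha'_{a_0} \le \alpha'_{b_0}, \ldots, \alpha'_{a_{l-1}} \le \alpha'_{b_{l-1}} \rangle\, (x'_0 \colon T'_0, \ldots, x'_{n-1} \colon T'_{n-1}) \to T'_n \\ \text{for any } j \in [l],\ \alpha_{a_j} \le_{\mathbf{A}} \alpha_{b_j} \qquad \text{for any } i \in [n+1],\ T_i = T'_i[\alpha_0/\alpha'_0, \ldots, \alpha_{m-1}/\alpha'_{m-1}] \end{array}}{\mathsf{let}\ y = g\langle \alpha_0, \ldots, \alpha_{m-1} \rangle(x_0, \ldots, x_{n-1}) \colon_{\Pi,f} (\mathbf{\Gamma} + \{x_i \colon T_i \mid i \in [n]\}, \mathbf{A}) \to (\mathbf{\Gamma} + \{y \colon T_n\}, \mathbf{A})} \]

$\Sigma_{\Pi,f}$: the function signature of the function $f$ in $\Pi$

\[ \mathsf{intro}\ \alpha \colon_{\Pi,f} \bigl(\mathbf{\Gamma}, (A, R)\bigr) \to \bigl(\mathbf{\Gamma}, (\{\alpha\} + A,\ \{\alpha\} \times (\{\alpha\} + A_{\mathrm{ex}\,\Pi,f}) + R)\bigr) \]

\[ \frac{\alpha \notin A_{\mathrm{ex}\,\Pi,f}}{\mathsf{now}\ \alpha \colon_{\Pi,f} \bigl(\mathbf{\Gamma}, (\{\alpha\} + A, R)\bigr) \to \bigl(\{\mathrm{thaw}_\alpha(x \colon^{a} T) \mid x \colon^{a} T \in \mathbf{\Gamma}\},\ (A, \{(\beta, \gamma) \in R \mid \beta \ne \alpha\})\bigr)} \]

\[ \mathrm{thaw}_\alpha(x \colon^{a} T) := \begin{cases} x \colon T & (a = \dagger\alpha) \\ x \colon^{a} T & (\text{otherwise}) \end{cases} \]

\[ \frac{\alpha, \beta \notin A_{\mathrm{ex}\,\Pi,f}}{\alpha \le \beta \colon_{\Pi,f} \bigl(\mathbf{\Gamma}, (A, R)\bigr) \to \bigl(\mathbf{\Gamma}, (A, (\{(\alpha, \beta)\} \cup R)^{+})\bigr)} \]

\[ \mathsf{let}\ {*y} = \mathit{const} \colon_{\Pi,f} (\mathbf{\Gamma}, \mathbf{A}) \to (\mathbf{\Gamma} + \{y \colon \mathsf{own}\ T_{\mathit{const}}\}, \mathbf{A}) \]

$T_{\mathit{const}}$: the type of $\mathit{const}$ ($\mathsf{int}$ or $\mathsf{unit}$)

\[ \frac{x \colon P\ \mathsf{int},\ x' \colon P'\ \mathsf{int} \in \mathbf{\Gamma}}{\mathsf{let}\ {*y} = {*x}\ \mathit{op}\ {*x'} \colon_{\Pi,f} (\mathbf{\Gamma}, \mathbf{A}) \to (\mathbf{\Gamma} + \{y \colon \mathsf{own}\ T_{\mathit{op}}\}, \mathbf{A})} \]

$T_{\mathit{op}}$: the output type of $\mathit{op}$ ($\mathsf{int}$ or $\mathsf{bool}$)

\[ \mathsf{let}\ {*y} = \mathsf{rand}() \colon_{\Pi,f} (\mathbf{\Gamma}, \mathbf{A}) \to (\mathbf{\Gamma} + \{y \colon \mathsf{own}\ \mathsf{int}\}, \mathbf{A}) \]

\[ \mathsf{let}\ {*y} = \mathsf{inj}^{T_0 + T_1}_i\, {*x} \colon_{\Pi,f} (\mathbf{\Gamma} + \{x \colon \mathsf{own}\ T_i\}, \mathbf{A}) \to (\mathbf{\Gamma} + \{y \colon \mathsf{own}\ (T_0 + T_1)\}, \mathbf{A}) \]

\[ \mathsf{let}\ {*y} = ({*x_0}, {*x_1}) \colon_{\Pi,f} (\mathbf{\Gamma} + \{x_0 \colon \mathsf{own}\ T_0,\ x_1 \colon \mathsf{own}\ T_1\}, \mathbf{A}) \to (\mathbf{\Gamma} + \{y \colon \mathsf{own}\ (T_0 \times T_1)\}, \mathbf{A}) \]

\[ \mathsf{let}\ ({*y_0}, {*y_1}) = {*x} \colon_{\Pi,f} (\mathbf{\Gamma} + \{x \colon P\,(T_0 \times T_1)\}, \mathbf{A}) \to (\mathbf{\Gamma} + \{y_0 \colon P\,T_0,\ y_1 \colon P\,T_1\}, \mathbf{A}) \]

Rule for Drop.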
The precondition for the typing rule on $\mathsf{drop}\ x$ is there only to simplify the formal definitions. For concrete operational semantics, a non-guarded $\mathsf{own}$ within $\mathsf{own}\ U$ causes nested releases of memory cells. For translation to CHCs, a non-guarded $\mathsf{mut}$ within $\mathsf{own}\ U$ would make value checks complicated.
This precondition does not weaken the expressivity, because we can divide pointers by dereference ($\mathsf{let}\ y = {*x}$), pair destruction ($\mathsf{let}\ ({*y_0}, {*y_1}) = {*x}$) and variant destruction ($\mathsf{match}\ {*x}\ \{\cdots\}$), possibly using loops or recursion for recursive types.

Rule for Swap.
We can omit swap between two owning pointers because it is essentially the same as swapping the names of the pointers. Note that an active (i.e. not frozen) owning pointer has no other alias at all.
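The remark about swap can be illustrated in surface Rust. The following is only a sketch under assumptions: `swap_targets` and `swap_owned` are hypothetical helper names, `Box<i32>` stands in for an owning pointer, and `&mut i32` stands in for a mutable reference. It shows that swapping through a mutable reference needs a real memory operation, whereas for two owning pointers the same observable effect is obtained by exchanging which name holds which pointer.

```rust
/// Swap the targets of two mutable references. This corresponds to the
/// case kept in the typing rule, where one side is a mutable reference.
fn swap_targets(x: &mut i32, y: &mut i32) {
    std::mem::swap(x, y);
}

/// For two owning pointers (modeled here as `Box`), no memory operation is
/// needed: returning them in exchanged order just renames the pointers.
fn swap_owned(x: Box<i32>, y: Box<i32>) -> (Box<i32>, Box<i32>) {
    (y, x)
}
```

Calling `swap_targets(&mut a, &mut b)` on `a = 1, b = 2` leaves `a = 2, b = 1`, and `swap_owned(Box::new(1), Box::new(2))` returns the boxes in exchanged order without touching their contents.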
Subtyping.
The subtyping judgment $\Xi \vdash T \le_{\mathbf{A}} U$ is defined as follows. Here, $\Xi$ is a set of assumptions of form $T \le U$, which is used for subtyping on recursive types. $\emptyset \vdash T \le_{\mathbf{A}} U$ can be shortened into $T \le_{\mathbf{A}} U$.

\[ \frac{T \le U \in \Xi}{\Xi \vdash T \le_{\mathbf{A}} U} \qquad \frac{\Xi \vdash T \le_{\mathbf{A}} U}{\Xi \vdash \check{P}\,T \le_{\mathbf{A}} \check{P}\,U} \qquad \frac{\Xi \vdash T \le_{\mathbf{A}} U,\ U \le_{\mathbf{A}} T}{\Xi \vdash \mathsf{mut}_\alpha\, T \le_{\mathbf{A}} \mathsf{mut}_\alpha\, U} \qquad \frac{\Xi \vdash \beta \le_{\mathbf{A}} \alpha}{\Xi \vdash R_\alpha\, T \le_{\mathbf{A}} R_\beta\, T} \]

\[ \frac{\Xi \vdash T_0 \le_{\mathbf{A}} U_0,\ T_1 \le_{\mathbf{A}} U_1}{\Xi \vdash T_0 + T_1 \le_{\mathbf{A}} U_0 + U_1} \qquad \frac{\Xi \vdash T_0 \le_{\mathbf{A}} U_0,\ T_1 \le_{\mathbf{A}} U_1}{\Xi \vdash T_0 \times T_1 \le_{\mathbf{A}} U_0 \times U_1} \]

\[ \Xi \vdash \mu X.T \le_{\mathbf{A}} T[\mu X.T/X],\ \ T[\mu X.T/X] \le_{\mathbf{A}} \mu X.T \]

\[ \frac{X', Y' \text{ are fresh in } \Xi \qquad \Xi + \{X' \le Y'\} \vdash T[X'/X] \le_{\mathbf{A}} U[Y'/Y]}{\Xi \vdash \mu X.T \le_{\mathbf{A}} \mu Y.U} \]

\[ \frac{X', Y' \text{ are fresh in } \Xi \qquad \Xi + \{X' \le Y',\ Y' \le X'\} \vdash T[X'/X] \le_{\mathbf{A}} U[Y'/Y],\ U[Y'/Y] \le_{\mathbf{A}} T[X'/X]}{\Xi \vdash \mu X.T \le_{\mathbf{A}} \mu Y.U,\ \ \mu Y.U \le_{\mathbf{A}} \mu X.T} \]

\[ \Xi \vdash T \le_{\mathbf{A}} T \qquad \frac{\Xi \vdash T \le_{\mathbf{A}} T',\ T' \le_{\mathbf{A}} T''}{\Xi \vdash T \le_{\mathbf{A}} T''} \]

A.2 Complete Rules and an Example Execution for Concrete Operational Semantics
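The following are the complete rules for the judgments $\mathbf{C} \to_\Pi \mathbf{C}'$ and $\mathrm{final}_\Pi(\mathbf{C})$.

\[ \frac{S_{\Pi,f,L} = \mathsf{let}\ y = \mathsf{mutbor}_\alpha\, x;\ \mathsf{goto}\ L' \qquad \mathbf{F}(x) = a}{[f, L]\ \mathbf{F};\ \mathbf{S} \mid \mathbf{H} \ \to_\Pi\ [f, L']\ \mathbf{F} + \{(y, a)\};\ \mathbf{S} \mid \mathbf{H}} \]

\[ \frac{S_{\Pi,f,L} = \mathsf{drop}\ x;\ \mathsf{goto}\ L' \qquad \mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{own}\ T}{[f, L]\ \mathbf{F} + \{(x, a)\};\ \mathbf{S} \mid \mathbf{H} + \{(a + k, n_k) \mid k \in [\#T]\} \ \to_\Pi\ [f, L']\ \mathbf{F};\ \mathbf{S} \mid \mathbf{H}} \]

\[ \frac{S_{\Pi,f,L} = \mathsf{drop}\ x;\ \mathsf{goto}\ L' \qquad \mathrm{Ty}_{\Pi,f,L}(x) = R_\alpha\, T}{[f, L]\ \mathbf{F} + \{(x, a)\};\ \mathbf{S} \mid \mathbf{H} \ \to_\Pi\ [f, L']\ \mathbf{F};\ \mathbf{S} \mid \mathbf{H}} \]

\[ \frac{S_{\Pi,f,L} = \mathsf{immut}\ x;\ \mathsf{goto}\ L'}{[f, L]\ \mathbf{F};\ \mathbf{S} \mid \mathbf{H} \ \to_\Pi\ [f, L']\ \mathbf{F};\ \mathbf{S} \mid \mathbf{H}} \]

\[ \frac{S_{\Pi,f,L} = \mathsf{swap}(*x, *y);\ \mathsf{goto}\ L' \qquad \mathrm{Ty}_{\Pi,f,L}(x) = P\,T \qquad \mathbf{F}(x) = a \qquad \mathbf{F}(y) = b}{\begin{array}{l} [f, L]\ \mathbf{F};\ \mathbf{S} \mid \mathbf{H} + \{(a + k, m_k) \mid k \in [\#T]\} + \{(b + k, n_k) \mid k \in [\#T]\} \\ \quad \to_\Pi\ [f, L']\ \mathbf{F};\ \mathbf{S} \mid \mathbf{H} + \{(a + k, n_k) \mid k \in [\#T]\} + \{(b + k, m_k) \mid k \in [\#T]\} \end{array}} \]

\[ \frac{S_{\Pi,f,L} = \mathsf{let}\ {*y} = x;\ \mathsf{goto}\ L'}{[f, L]\ \mathbf{F} + \{(x, a')\};\ \mathbf{S} \mid \mathbf{H} \ \to_\Pi\ [f, L']\ \mathbf{F} + \{(y, a)\};\ \mathbf{S} \mid \mathbf{H} + \{(a, a')\}} \]

\[ \frac{S_{\Pi,f,L} = \mathsf{let}\ y = {*x};\ \mathsf{goto}\ L' \qquad \mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{own}\ P\,T}{[f, L]\ \mathbf{F} + \{(x, a)\};\ \mathbf{S} \mid \mathbf{H} + \{(a, a')\} \ \to_\Pi\ [f, L']\ \mathbf{F} + \{(y, a')\};\ \mathbf{S} \mid \mathbf{H}} \]

\[ \frac{S_{\Pi,f,L} = \mathsf{let}\ y = {*x};\ \mathsf{goto}\ L' \qquad \mathrm{Ty}_{\Pi,f,L}(x) = R_\alpha\, P\,T \qquad \mathbf{H}(a) = a'}{[f, L]\ \mathbf{F} + \{(x, a)\};\ \mathbf{S} \mid \mathbf{H} \ \to_\Pi\ [f, L']\ \mathbf{F} + \{(y, a')\};\ \mathbf{S} \mid \mathbf{H}} \]

\[ \frac{S_{\Pi,f,L} = \mathsf{let}\ {*y} = \mathsf{copy}\ {*x};\ \mathsf{goto}\ L' \qquad \mathrm{Ty}_{\Pi,f,L}(x) = P\,T \qquad \mathbf{F}(x) = a}{[f, L]\ \mathbf{F};\ \mathbf{S} \mid \mathbf{H} \ \to_\Pi\ [f, L']\ \mathbf{F} + \{(y, b)\};\ \mathbf{S} \mid \mathbf{H} + \{(b + k, \mathbf{H}(a + k)) \mid k \in [\#T]\}} \]

\[ \frac{S_{\Pi,f,L} = I;\ \mathsf{goto}\ L' \qquad I = x\ \mathsf{as}\ T,\ \mathsf{intro}\ \alpha,\ \mathsf{now}\ \alpha,\ \alpha \le \beta}{[f, L]\ \mathbf{F};\ \mathbf{S} \mid \mathbf{H} \ \to_\Pi\ [f, L']\ \mathbf{F};\ \mathbf{S} \mid \mathbf{H}} \]

\[ \frac{S_{\Pi,f,L} = \mathsf{let}\ y = g\langle \cdots \rangle(x_0, \ldots, x_{n-1});\ \mathsf{goto}\ L' \qquad \Sigma_{\Pi,g} = \langle \cdots \rangle (x'_0 \colon T_0, \ldots, x'_{n-1} \colon T_{n-1}) \to U}{[f, L]\ \mathbf{F} + \{(x_i, a_i) \mid i \in [n]\};\ \mathbf{S} \mid \mathbf{H} \ \to_\Pi\ [g, \mathsf{entry}]\ \{(x'_i, a_i) \mid i \in [n]\};\ [f, L]\ y, \mathbf{F};\ \mathbf{S} \mid \mathbf{H}} \]

\[ \frac{S_{\Pi,f,L} = \mathsf{return}\ x}{[f, L]\ \{(x, a)\};\ [g, L']\ x', \mathbf{F}';\ \mathbf{S} \mid \mathbf{H} \ \to_\Pi\ [g, L']\ \mathbf{F}' + \{(x', a)\};\ \mathbf{S} \mid \mathbf{H}} \qquad \frac{S_{\Pi,f,L} = \mathsf{return}\ x}{\mathrm{final}_\Pi\bigl([f, L]\ \{(x, a)\} \mid \mathbf{H}\bigr)} \]

\[ \frac{S_{\Pi,f,L} = \mathsf{let}\ {*y} = \mathit{const};\ \mathsf{goto}\ L' \qquad \mathbf{H}' = \begin{cases} \{(a, n)\} & (\mathit{const} = n) \\ \emptyset & (\mathit{const} = ()) \end{cases}}{[f, L]\ \mathbf{F};\ \mathbf{S} \mid \mathbf{H} \ \to_\Pi\ [f, L']\ \mathbf{F} + \{(y, a)\};\ \mathbf{S} \mid \mathbf{H} + \mathbf{H}'} \]

\[ \frac{S_{\Pi,f,L} = \mathsf{let}\ {*y} = {*x}\ \mathit{op}\ {*x'};\ \mathsf{goto}\ L' \qquad \mathbf{F}(x) = a \qquad \mathbf{F}(x') = a'}{[f, L]\ \mathbf{F};\ \mathbf{S} \mid \mathbf{H} \ \to_\Pi\ [f, L']\ \mathbf{F} + \{(y, b)\};\ \mathbf{S} \mid \mathbf{H} + \{(b, \mathbf{H}(a)\ \langle \mathit{op} \rangle\ \mathbf{H}(a'))\}} \]

$\langle \mathit{op} \rangle$: $\mathit{op}$ as a binary operation on integers, with true/false encoded as 1/0

\[ \frac{S_{\Pi,f,L} = \mathsf{let}\ {*y} = \mathsf{rand}();\ \mathsf{goto}\ L'}{[f, L]\ \mathbf{F};\ \mathbf{S} \mid \mathbf{H} \ \to_\Pi\ [f, L']\ \mathbf{F} + \{(y, a)\};\ \mathbf{S} \mid \mathbf{H} + \{(a, n)\}} \]

\[ \frac{S_{\Pi,f,L} = \mathsf{let}\ {*y} = \mathsf{inj}^{T_0 + T_1}_i\, {*x};\ \mathsf{goto}\ L' \qquad \mathbf{H}_0 = \{(a' + 1 + \#T_i + k,\ 0) \mid k \in [(\#T_{1-i} - \#T_i)_{\ge 0}]\}}{\begin{array}{l} [f, L]\ \mathbf{F} + \{(x, a)\};\ \mathbf{S} \mid \mathbf{H} + \{(a + k, m_k) \mid k \in [\#T_i]\} \\ \quad \to_\Pi\ [f, L']\ \mathbf{F} + \{(y, a')\};\ \mathbf{S} \mid \mathbf{H} + \{(a', i)\} + \{(a' + 1 + k, m_k) \mid k \in [\#T_i]\} + \mathbf{H}_0 \end{array}} \]

\[ \frac{\begin{array}{c} S_{\Pi,f,L} = \mathsf{match}\ {*x}\ \{\mathsf{inj}_0\, {*y_0} \to \mathsf{goto}\ L'_0,\ \mathsf{inj}_1\, {*y_1} \to \mathsf{goto}\ L'_1\} \qquad \mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{own}\ (T_0 + T_1) \\ i \in [2] \qquad \mathbf{H}_0 = \{(a + 1 + \#T_i + k,\ 0) \mid k \in [(\#T_{1-i} - \#T_i)_{\ge 0}]\} \end{array}}{\begin{array}{l} [f, L]\ \mathbf{F} + \{(x, a)\};\ \mathbf{S} \mid \mathbf{H} + \{(a, i)\} + \{(a + 1 + k, m_k) \mid k \in [\#T_i]\} + \mathbf{H}_0 \\ \quad \to_\Pi\ [f, L'_i]\ \mathbf{F} + \{(y_i, a + 1)\};\ \mathbf{S} \mid \mathbf{H} + \{(a + 1 + k, m_k) \mid k \in [\#T_i]\} \end{array}} \]

\[ \frac{S_{\Pi,f,L} = \mathsf{match}\ {*x}\ \{\mathsf{inj}_0\, {*y_0} \to \mathsf{goto}\ L'_0,\ \mathsf{inj}_1\, {*y_1} \to \mathsf{goto}\ L'_1\} \qquad \mathrm{Ty}_{\Pi,f,L}(x) = R_\alpha\, (T_0 + T_1) \qquad \mathbf{H}(a) = i \in [2]}{[f, L]\ \mathbf{F} + \{(x, a)\};\ \mathbf{S} \mid \mathbf{H} \ \to_\Pi\ [f, L'_i]\ \mathbf{F} + \{(y_i, a + 1)\};\ \mathbf{S} \mid \mathbf{H}} \]

\[ \frac{S_{\Pi,f,L} = \mathsf{let}\ {*y} = ({*x_0}, {*x_1});\ \mathsf{goto}\ L' \qquad \text{for each } i \in [2],\ \mathrm{Ty}_{\Pi,f,L}(x_i) = \mathsf{own}\ T_i}{\begin{array}{l} [f, L]\ \mathbf{F} + \{(x_0, a_0), (x_1, a_1)\};\ \mathbf{S} \mid \mathbf{H} + \{(a_i + k, m_{ik}) \mid i \in [2],\ k \in [\#T_i]\} \\ \quad \to_\Pi\ [f, L']\ \mathbf{F} + \{(y, a')\};\ \mathbf{S} \mid \mathbf{H} + \{(a' + i\,\#T_0 + k, m_{ik}) \mid i \in [2],\ k \in [\#T_i]\} \end{array}} \]

\[ \frac{S_{\Pi,f,L} = \mathsf{let}\ ({*y_0}, {*y_1}) = {*x};\ \mathsf{goto}\ L' \qquad \mathrm{Ty}_{\Pi,f,L}(x) = P\,(T_0 \times T_1)}{[f, L]\ \mathbf{F} + \{(x, a)\};\ \mathbf{S} \mid \mathbf{H} \ \to_\Pi\ [f, L']\ \mathbf{F} + \{(y_0, a), (y_1, a + \#T_0)\};\ \mathbf{S} \mid \mathbf{H}} \]

Example 5 (Execution on Concrete Operational Semantics).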
The following is an example execution for the COR program of Example 1. ♠, ♥, ♦, ♣ represent some distinct addresses (e.g. 100 , , , ). →_Π is abbreviated as →.

[inc-max, entry] {(oa, ♠), (ob, ♥)} | {(♠, , (♥, }
→ [inc-max, L1] {(oa, ♠), (ob, ♥)} | {(♠, , (♥, }
→+ [inc-max, L3] {(ma, ♠), (mb, ♥), (oa, ♠), (ob, ♥)} | {(♠, , (♥, }
→ [take-max, entry] {(ma, ♠), (mb, ♥)}; [inc-max, L4] mc, {(oa, ♠), (ob, ♥)} | {(♠, , (♥, }
→ [take-max, L1] {(ord, ♦), (ma, ♠), (mb, ♥)}; [inc-max, L4] mc, {(oa, ♠), (ob, ♥)} | {(♠, , (♥, , (♦, }
→ [take-max, L2] {(ou, ♦+1), (ma, ♠), (mb, ♥)}; [inc-max, L4] mc, {(oa, ♠), (ob, ♥)} | {(♠, , (♥, }
→+ [take-max, L4] {(ma, ♠)}; [inc-max, L4] mc, {(oa, ♠), (ob, ♥)} | {(♠, , (♥, }
→ [inc-max, L4] {(mc, ♠), (oa, ♠), (ob, ♥)} | {(♠, , (♥, }
→ [inc-max, L5] {(o1, ♦), (mc, ♠), (oa, ♠), (ob, ♥)} | {(♠, , (♥, , (♦, }
→+ [inc-max, L7] {(oc′, ♣), (mc, ♠), (oa, ♠), (ob, ♥)} | {(♠, , (♥, , (♣, }
→ [inc-max, L8] {(oc′, ♣), (mc, ♠), (oa, ♠), (ob, ♥)} | {(♠, , (♥, , (♣, }
→+ [inc-max, L10] {(oa, ♠), (ob, ♥)} | {(♠, , (♥, }
→ [inc-max, L11] {(oa, ♠), (ob, ♥)} | {(♠, , (♥, }
→+ [inc-max, L14] {(ores, ♦)} | {(♦, }

The execution is quite straightforward. Recall that every variable is a pointer and holds just an address. Most of the data is stored in the heap.
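To make the trace easier to follow, here is a hypothetical Rust rendering of the kind of program it executes. This is an assumption reconstructed from the variable names in the trace (`ma`, `mb`, `mc`, `ores`), not a transcription of Example 1: `take_max` returns the mutable reference whose target is larger, and `inc_max` increments through that reference and then reports whether the two integers differ.

```rust
/// Return the mutable reference with the larger target (ties go to `ma`).
/// The unused reference is simply dropped, which ends its borrow.
fn take_max<'a>(ma: &'a mut i32, mb: &'a mut i32) -> &'a mut i32 {
    if *ma >= *mb {
        ma
    } else {
        mb
    }
}

/// Increment the larger of `a` and `b` through a mutable reference,
/// then compare the two after the borrows have ended.
fn inc_max(mut a: i32, mut b: i32) -> bool {
    {
        let mc = take_max(&mut a, &mut b);
        *mc += 1;
    }
    a != b
}
```

Barring overflow, `inc_max` returns `true` for all inputs: the larger value (or `a` on a tie) is incremented, so the two can no longer be equal. In the trace, the addresses ♠ and ♥ play the role of the storage for `a` and `b`, and `mc` ends up aliasing one of them.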
B Complete Rules for Translation from Labeled Statements to CHCs
We present below the complete rules for $(\!| L \colon S |\!)_{\Pi,f}$.

\[ (\!| L \colon \mathsf{let}\ y = \mathsf{mutbor}_\alpha\, x;\ \mathsf{goto}\ L' |\!)_{\Pi,f} := \begin{cases} \bigl\{ \forall(\Delta_{\Pi,f,L} + \{(x_\circ, (\!|T|\!))\}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'}[\langle *x, x_\circ \rangle / y,\ \langle x_\circ \rangle / x] \bigr\} & (\mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{own}\ T) \\ \bigl\{ \forall(\Delta_{\Pi,f,L} + \{(x_\circ, (\!|T|\!))\}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'}[\langle *x, x_\circ \rangle / y,\ \langle x_\circ, \circ x \rangle / x] \bigr\} & (\mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{mut}_\alpha\, T) \end{cases} \]

\[ (\!| L \colon \mathsf{drop}\ x;\ \mathsf{goto}\ L' |\!)_{\Pi,f} := \begin{cases} \bigl\{ \forall(\Delta_{\Pi,f,L}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'} \bigr\} & (\mathrm{Ty}_{\Pi,f,L}(x) = \check{P}\,T) \\ \bigl\{ \forall(\Delta_{\Pi,f,L} - \{(x, \mathsf{mut}\ (\!|T|\!))\} + \{(x_*, (\!|T|\!))\}).\ \check\varphi_{\Pi,f,L}[\langle x_*, x_* \rangle / x] \Longleftarrow \check\varphi_{\Pi,f,L'} \bigr\} & (\mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{mut}_\alpha\, T) \end{cases} \]

\[ (\!| L \colon \mathsf{immut}\ x;\ \mathsf{goto}\ L' |\!)_{\Pi,f} := \bigl\{ \forall(\Delta_{\Pi,f,L} - \{(x, \mathsf{mut}\ (\!|T|\!))\} + \{(x_*, (\!|T|\!))\}).\ \check\varphi_{\Pi,f,L}[\langle x_*, x_* \rangle / x] \Longleftarrow \check\varphi_{\Pi,f,L'}[\langle x_* \rangle / x] \bigr\} \quad (\mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{mut}_\alpha\, T) \]

\[ (\!| L \colon \mathsf{swap}(*x, *y);\ \mathsf{goto}\ L' |\!)_{\Pi,f} := \begin{cases} \bigl\{ \forall(\Delta_{\Pi,f,L}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'}[\langle *y, \circ x \rangle / x,\ \langle *x \rangle / y] \bigr\} & (\mathrm{Ty}_{\Pi,f,L}(y) = \mathsf{own}\ T) \\ \bigl\{ \forall(\Delta_{\Pi,f,L}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'}[\langle *y, \circ x \rangle / x,\ \langle *x, \circ y \rangle / y] \bigr\} & (\mathrm{Ty}_{\Pi,f,L}(y) = \mathsf{mut}_\alpha\, T) \end{cases} \]

\[ (\!| L \colon \mathsf{let}\ {*y} = x;\ \mathsf{goto}\ L' |\!)_{\Pi,f} := \bigl\{ \forall(\Delta_{\Pi,f,L}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'}[\langle x \rangle / y] \bigr\} \]

\[ (\!| L \colon \mathsf{let}\ y = {*x};\ \mathsf{goto}\ L' |\!)_{\Pi,f} := \begin{cases} \bigl\{ \forall(\Delta_{\Pi,f,L}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'}[*x / y] \bigr\} & (\mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{own}\ P\,T) \\ \bigl\{ \forall(\Delta_{\Pi,f,L}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'}[\langle {*}{*}x \rangle / y] \bigr\} & (\mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{immut}_\alpha\, P\,T) \\ \bigl\{ \forall(\Delta_{\Pi,f,L}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'}[\langle {*}{*}x, {*}{\circ}x \rangle / y] \bigr\} & (\mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{mut}_\alpha\, \mathsf{own}\ T) \\ \bigl\{ \forall(\Delta_{\Pi,f,L} - \{(x, \mathsf{mut}\ \mathsf{box}\ (\!|T|\!))\} + \{(x_*, \mathsf{box}\ (\!|T|\!))\}).\ \check\varphi_{\Pi,f,L}[\langle x_*, x_* \rangle / x] \Longleftarrow \check\varphi_{\Pi,f,L'}[x_* / y] \bigr\} & (\mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{mut}_\alpha\, \mathsf{immut}_\beta\, T) \\ \bigl\{ \forall(\Delta_{\Pi,f,L} - \{(x, \mathsf{mut}\ \mathsf{mut}\ (\!|T|\!))\} + \{(x_{**}, (\!|T|\!)), (x_{*\circ}, (\!|T|\!)), (x_{\circ *}, (\!|T|\!))\}).\ \check\varphi_{\Pi,f,L}[\langle \langle x_{**}, x_{*\circ} \rangle, \langle x_{\circ *}, x_{*\circ} \rangle \rangle / x] \Longleftarrow \check\varphi_{\Pi,f,L'}[\langle x_{**}, x_{\circ *} \rangle / y] \bigr\} & (\mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{mut}_\alpha\, \mathsf{mut}_\beta\, T) \end{cases} \]

\[ (\!| L \colon \mathsf{let}\ {*y} = \mathsf{copy}\ {*x};\ \mathsf{goto}\ L' |\!)_{\Pi,f} := \bigl\{ \forall(\Delta_{\Pi,f,L}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'}[\langle *x \rangle / y] \bigr\} \]

\[ (\!| L \colon x\ \mathsf{as}\ T;\ \mathsf{goto}\ L' |\!)_{\Pi,f} := \bigl\{ \forall(\Delta_{\Pi,f,L}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'} \bigr\} \]

\[ (\!| L \colon \mathsf{let}\ y = g\langle \cdots \rangle(x_0, \ldots, x_{n-1});\ \mathsf{goto}\ L' |\!)_{\Pi,f} := \bigl\{ \forall(\Delta_{\Pi,f,L} + \{(y, (\!|\mathrm{Ty}_{\Pi,f,L'}(y)|\!))\}).\ \check\varphi_{\Pi,f,L} \Longleftarrow g_{\mathsf{entry}}(x_0, \ldots, x_{n-1}, y) \land \check\varphi_{\Pi,f,L'} \bigr\} \]

\[ (\!| L \colon \mathsf{return}\ x |\!)_{\Pi,f} := \bigl\{ \forall(\Delta_{\Pi,f,L}).\ \check\varphi_{\Pi,f,L}[x / \mathit{res}] \Longleftarrow \top \bigr\} \]

\[ (\!| L \colon \mathsf{intro}\ \alpha;\ \mathsf{goto}\ L' |\!)_{\Pi,f} = (\!| L \colon \mathsf{now}\ \alpha;\ \mathsf{goto}\ L' |\!)_{\Pi,f} = (\!| L \colon \alpha \le \beta;\ \mathsf{goto}\ L' |\!)_{\Pi,f} := \bigl\{ \forall(\Delta_{\Pi,f,L}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'} \bigr\} \]

\[ (\!| L \colon \mathsf{let}\ {*y} = \mathit{const};\ \mathsf{goto}\ L' |\!)_{\Pi,f} := \bigl\{ \forall(\Delta_{\Pi,f,L}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'}[\langle \mathit{const} \rangle / y] \bigr\} \]

\[ (\!| L \colon \mathsf{let}\ {*y} = {*x}\ \mathit{op}\ {*x'};\ \mathsf{goto}\ L' |\!)_{\Pi,f} := \bigl\{ \forall(\Delta_{\Pi,f,L}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'}[\langle {*x}\ \mathit{op}\ {*x'} \rangle / y] \bigr\} \]

\[ (\!| L \colon \mathsf{let}\ {*y} = \mathsf{rand}();\ \mathsf{goto}\ L' |\!)_{\Pi,f} := \bigl\{ \forall(\Delta_{\Pi,f,L'}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'} \bigr\} \]

\[ (\!| L \colon \mathsf{let}\ {*y} = \mathsf{inj}^{T_0 + T_1}_i\, {*x};\ \mathsf{goto}\ L' |\!)_{\Pi,f} := \bigl\{ \forall(\Delta_{\Pi,f,L}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'}[\langle \mathsf{inj}_i\, {*x} \rangle / y] \bigr\} \]

\[ (\!| L \colon \mathsf{match}\ {*x}\ \{\mathsf{inj}_0\, {*y_0} \to \mathsf{goto}\ L_0,\ \mathsf{inj}_1\, {*y_1} \to \mathsf{goto}\ L_1\} |\!)_{\Pi,f} := \Bigl\{ \forall(\Delta_{\Pi,f,L_i} - \{(x, \mathsf{mut}\ ((\!|T_0|\!) + (\!|T_1|\!)))\} + \{(x_{*!}, (\!|T_i|\!))\}).\ \check\varphi_{\Pi,f,L}[\langle \mathsf{inj}_i\, x_{*!} \rangle / x] \Longleftarrow \check\varphi_{\Pi,f,L_i}[\langle x_{*!} \rangle / y_i] \ \Bigm|\ i \in [2] \Bigr\} \quad \text{if } \mathrm{Ty}_{\Pi,f,L}(x) = \check{P}\,(T_0 + T_1) \]

\[ (\!| L \colon \mathsf{match}\ {*x}\ \{\mathsf{inj}_0\, {*y_0} \to \mathsf{goto}\ L_0,\ \mathsf{inj}_1\, {*y_1} \to \mathsf{goto}\ L_1\} |\!)_{\Pi,f} := \Bigl\{ \forall(\Delta_{\Pi,f,L_i} - \{(x, \mathsf{mut}\ ((\!|T_0|\!) + (\!|T_1|\!)))\} + \{(x_{*!}, (\!|T_i|\!)), (x_{\circ !}, (\!|T_i|\!))\}).\ \check\varphi_{\Pi,f,L}[\langle \mathsf{inj}_i\, x_{*!}, \mathsf{inj}_i\, x_{\circ !} \rangle / x] \Longleftarrow \check\varphi_{\Pi,f,L_i}[\langle x_{*!}, x_{\circ !} \rangle / y_i] \ \Bigm|\ i \in [2] \Bigr\} \quad \text{if } \mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{mut}_\alpha\, (T_0 + T_1) \]

\[ (\!| L \colon \mathsf{let}\ {*y} = ({*x_0}, {*x_1});\ \mathsf{goto}\ L' |\!)_{\Pi,f} := \bigl\{ \forall(\Delta_{\Pi,f,L}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'}[\langle ({*x_0}, {*x_1}) \rangle / y] \bigr\} \]

\[ (\!| L \colon \mathsf{let}\ ({*y_0}, {*y_1}) = {*x};\ \mathsf{goto}\ L' |\!)_{\Pi,f} := \begin{cases} \bigl\{ \forall(\Delta_{\Pi,f,L}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'}[\langle ({*x}).0 \rangle / y_0,\ \langle ({*x}).1 \rangle / y_1] \bigr\} & (\mathrm{Ty}_{\Pi,f,L}(x) = \check{P}\,T) \\ \bigl\{ \forall(\Delta_{\Pi,f,L}).\ \check\varphi_{\Pi,f,L} \Longleftarrow \check\varphi_{\Pi,f,L'}[\langle ({*x}).0, ({\circ}x).0 \rangle / y_0,\ \langle ({*x}).1, ({\circ}x).1 \rangle / y_1] \bigr\} & (\mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{mut}_\alpha\, T) \end{cases} \]

Rule for Dereference.
The rule for dereference (let y = ∗x) may seem complicated at a glance. This is only because this single instruction can cause multiple events (dereference and release of a mutable reference).

C Proof of the Correctness of the CHC Representation
C.1 Abstract Operational Semantics
We introduce abstract operational semantics for COR, as a mediator between concrete operational semantics and the logic. In abstract operational semantics, we get rid of heaps and directly represent each variable as a value, with future values expressed as abstract variables $\mathbf{x}$ (marked bold and light blue), which are strongly related to prophecy variables. An abstract variable represents the undetermined value of a mutable reference at the end of the borrow.

Formally, we introduce a pre-value, which is defined as follows:

\[
\text{(pre-value)}\quad \hat v, \hat w ::= \langle \hat v\rangle \mid \langle \hat v_*, \hat v_\circ\rangle \mid \mathsf{inj}_i\ \hat v \mid (\hat v_0, \hat v_1) \mid \mathit{const} \mid \mathbf{x}
\]

Abstract operational semantics is described as a transition on program states encoded as an abstract configuration $\mathcal{C}$, which is defined as follows. Here, an abstract stack frame $\mathcal{F}$ maps variables to pre-values. We may omit the terminator '; end'.

\[
\mathcal{S} ::= \mathsf{end} \mid [f, L]_\Theta\ x, \mathcal{F};\ \mathcal{S}
\qquad
\text{(abstract configuration)}\quad \mathcal{C} ::= [f, L]_\Theta\ \mathcal{F};\ \mathcal{S} \mid \mathbf{A}
\]

In order to facilitate proofs later, we append lifetime-related ghost information to $\mathcal{C}$, which does not directly affect the execution. $\mathbf{A}$ is a global lifetime context, which is the lifetime context of all local lifetime variables from all stack frames; we add a tag on a local lifetime variable (e.g. $\alpha^{(i)}$ instead of $\alpha$) to clarify which stack frame it belongs to. $\Theta$ is a lifetime parameter context, which maps the lifetime variables in the (local) lifetime context for a stack frame to the corresponding tagged lifetime variables in the global lifetime context.

Just as in concrete operational semantics, abstract operational semantics is characterized by the one-step transition relation $\mathcal{C} \to_\Pi \mathcal{C}'$ and the termination relation $\mathrm{final}_\Pi(\mathcal{C})$, which are defined by the following rules. $\mathcal{C}[\hat v/\mathbf{x}]$ is $\mathcal{C}$ with every $\mathbf{x}$ in its abstract stack frames replaced with $\hat v$. 'val' maps both $\langle \hat v\rangle$ and $\langle \hat v, \mathbf{x}_\circ\rangle$ to $\hat v$.
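As an informal illustration of the definitions above, the pre-value grammar, the 'val' map, and the substitution $\mathcal{C}[\hat v/\mathbf{x}]$ can be modeled roughly as follows. This is only a sketch under simplifying assumptions: the names `PreValue`, `val`, `subst`, and `release` are hypothetical, constants are restricted to integers, and a single stack frame stands in for a whole configuration.

```rust
use std::collections::BTreeMap;

// Hypothetical model of pre-values; not part of the formal development.
#[derive(Clone, Debug, PartialEq)]
enum PreValue {
    Own(Box<PreValue>),                 // <v>
    Mut(Box<PreValue>, Box<PreValue>),  // <v_*, v_o>: current and future value
    Inj(usize, Box<PreValue>),          // inj_i v
    Pair(Box<PreValue>, Box<PreValue>), // (v_0, v_1)
    Const(i64),                         // const (integers only in this sketch)
    Abs(String),                        // abstract variable x
}

use PreValue::*;

// 'val' maps both <v> and <v, x_o> to v; other shapes have no 'val'.
fn val(p: &PreValue) -> Option<&PreValue> {
    match p {
        Own(v) | Mut(v, _) => Some(&**v),
        _ => None,
    }
}

// p[v/x]: replace every occurrence of the abstract variable x in p with v.
fn subst(p: &PreValue, x: &str, v: &PreValue) -> PreValue {
    match p {
        Abs(y) if y.as_str() == x => v.clone(),
        Abs(_) | Const(_) => p.clone(),
        Own(a) => Own(Box::new(subst(a, x, v))),
        Mut(a, b) => Mut(Box::new(subst(a, x, v)), Box::new(subst(b, x, v))),
        Inj(i, a) => Inj(*i, Box::new(subst(a, x, v))),
        Pair(a, b) => Pair(Box::new(subst(a, x, v)), Box::new(subst(b, x, v))),
    }
}

// Releasing a mutable reference x = <v_*, x_o>: remove x from the frame and
// substitute v_* for x_o everywhere else (single-frame simplification).
fn release(frame: &mut BTreeMap<String, PreValue>, x: &str) {
    if let Some(Mut(cur, fut)) = frame.remove(x) {
        if let Abs(name) = *fut {
            for v in frame.values_mut() {
                *v = subst(v, &name, &cur);
            }
        }
    }
}
```

In this sketch, a lender holding `Own(Abs("a"))` observes the final value of the borrow once the borrower `Mut(Const(5), Abs("a"))` is released, which mirrors the role of abstract variables as prophecies.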
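\[
\dfrac{S_{\Pi,f,L} = \mathsf{let}\ y = \mathsf{mutbor}_\alpha\ x;\ \mathsf{goto}\ L' \qquad \mathbf{x}_\circ \text{ is fresh}}
{[f,L]_\Theta\ \mathcal{F} + \{(x, \langle \hat v_*\rangle)\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ [f,L']_\Theta\ \mathcal{F} + \{(y, \langle \hat v_*, \mathbf{x}_\circ\rangle), (x, \langle \mathbf{x}_\circ\rangle)\};\ \mathcal{S} \mid \mathbf{A}}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{let}\ y = \mathsf{mutbor}_\alpha\ x;\ \mathsf{goto}\ L' \qquad \mathbf{x}_\circ \text{ is fresh}}
{[f,L]_\Theta\ \mathcal{F} + \{(x, \langle \hat v_*, \mathbf{x}'_\circ\rangle)\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ [f,L']_\Theta\ \mathcal{F} + \{(y, \langle \hat v_*, \mathbf{x}_\circ\rangle), (x, \langle \mathbf{x}_\circ, \mathbf{x}'_\circ\rangle)\};\ \mathcal{S} \mid \mathbf{A}}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{drop}\ x;\ \mathsf{goto}\ L' \qquad \mathrm{Ty}_{\Pi,f,L}(x) = \check P\,T}
{[f,L]_\Theta\ \mathcal{F} + \{(x, \hat v)\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ [f,L']_\Theta\ \mathcal{F};\ \mathcal{S} \mid \mathbf{A}}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{drop}\ x;\ \mathsf{goto}\ L' \qquad \mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{mut}_\alpha\,T}
{[f,L]_\Theta\ \mathcal{F} + \{(x, \langle \hat v_*, \mathbf{x}_\circ\rangle)\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ \bigl([f,L']_\Theta\ \mathcal{F};\ \mathcal{S} \mid \mathbf{A}\bigr)\bigl[\hat v_*/\mathbf{x}_\circ\bigr]}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{immut}\ x;\ \mathsf{goto}\ L'}
{[f,L]_\Theta\ \mathcal{F} + \{(x, \langle \hat v_*, \mathbf{x}_\circ\rangle)\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ \bigl([f,L']_\Theta\ \mathcal{F} + \{(x, \langle \hat v_*\rangle)\};\ \mathcal{S} \mid \mathbf{A}\bigr)\bigl[\hat v_*/\mathbf{x}_\circ\bigr]}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{swap}(*x, *y);\ \mathsf{goto}\ L' \qquad \mathrm{Ty}_{\Pi,f,L}(y) = \mathsf{own}\ T}
{[f,L]_\Theta\ \mathcal{F} + \{(x, \langle \hat v_*, \mathbf{x}_\circ\rangle), (y, \langle \hat w_*\rangle)\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ [f,L']_\Theta\ \mathcal{F} + \{(x, \langle \hat w_*, \mathbf{x}_\circ\rangle), (y, \langle \hat v_*\rangle)\};\ \mathcal{S} \mid \mathbf{A}}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{swap}(*x, *y);\ \mathsf{goto}\ L' \qquad \mathrm{Ty}_{\Pi,f,L}(y) = \mathsf{mut}_\alpha\,T}
{[f,L]_\Theta\ \mathcal{F} + \{(x, \langle \hat v_*, \mathbf{x}_\circ\rangle), (y, \langle \hat w_*, \mathbf{y}_\circ\rangle)\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ [f,L']_\Theta\ \mathcal{F} + \{(x, \langle \hat w_*, \mathbf{x}_\circ\rangle), (y, \langle \hat v_*, \mathbf{y}_\circ\rangle)\};\ \mathcal{S} \mid \mathbf{A}}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{let}\ {*y} = x;\ \mathsf{goto}\ L'}
{[f,L]_\Theta\ \mathcal{F} + \{(x, \hat v)\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ [f,L']_\Theta\ \mathcal{F} + \{(y, \langle \hat v\rangle)\};\ \mathcal{S} \mid \mathbf{A}}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{let}\ y = {*x};\ \mathsf{goto}\ L' \qquad \mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{own}\ P\,T}
{[f,L]_\Theta\ \mathcal{F} + \{(x, \langle \hat v_*\rangle)\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ [f,L']_\Theta\ \mathcal{F} + \{(y, \hat v_*)\};\ \mathcal{S} \mid \mathbf{A}}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{let}\ y = {*x};\ \mathsf{goto}\ L' \qquad \mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{immut}_\alpha\,P\,T}
{[f,L]_\Theta\ \mathcal{F} + \{(x, \langle \hat v_*\rangle)\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ [f,L']_\Theta\ \mathcal{F} + \{(y, \langle \mathrm{val}(\hat v_*)\rangle)\};\ \mathcal{S} \mid \mathbf{A}}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{let}\ y = {*x};\ \mathsf{goto}\ L' \qquad \mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{mut}_\alpha\,\mathsf{own}\ T \qquad \mathbf{x}_{\circ *} \text{ is fresh}}
{[f,L]_\Theta\ \mathcal{F} + \{(x, \langle\langle \hat v_{**}\rangle, \mathbf{x}_\circ\rangle)\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ \bigl([f,L']_\Theta\ \mathcal{F} + \{(y, \langle \hat v_{**}, \mathbf{x}_{\circ *}\rangle)\};\ \mathcal{S} \mid \mathbf{A}\bigr)\bigl[\langle \mathbf{x}_{\circ *}\rangle/\mathbf{x}_\circ\bigr]}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{let}\ y = {*x};\ \mathsf{goto}\ L' \qquad \mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{mut}_\alpha\,\mathsf{immut}_\beta\,T}
{[f,L]_\Theta\ \mathcal{F} + \{(x, \langle\langle \hat v_{**}\rangle, \mathbf{x}_\circ\rangle)\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ \bigl([f,L']_\Theta\ \mathcal{F} + \{(y, \langle \hat v_{**}\rangle)\};\ \mathcal{S} \mid \mathbf{A}\bigr)\bigl[\langle \hat v_{**}\rangle/\mathbf{x}_\circ\bigr]}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{let}\ y = {*x};\ \mathsf{goto}\ L' \qquad \mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{mut}_\alpha\,\mathsf{mut}_\beta\,T \qquad \mathbf{x}_{*\circ} \text{ is fresh}}
{[f,L]_\Theta\ \mathcal{F} + \{(x, \langle\langle \hat v_{**}, \mathbf{x}'_{*\circ}\rangle, \mathbf{x}_\circ\rangle)\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ \bigl([f,L']_\Theta\ \mathcal{F} + \{(y, \langle \hat v_{**}, \mathbf{x}_{*\circ}\rangle)\};\ \mathcal{S} \mid \mathbf{A}\bigr)\bigl[\langle \mathbf{x}_{*\circ}, \mathbf{x}'_{*\circ}\rangle/\mathbf{x}_\circ\bigr]}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{let}\ {*y} = \mathsf{copy}\ {*x};\ \mathsf{goto}\ L'}
{[f,L]_\Theta\ \mathcal{F};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ [f,L']_\Theta\ \mathcal{F} + \{(y, \langle \mathrm{val}(\mathcal{F}(x))\rangle)\};\ \mathcal{S} \mid \mathbf{A}}
\qquad
\dfrac{S_{\Pi,f,L} = x\ \mathsf{as}\ T;\ \mathsf{goto}\ L'}
{[f,L]_\Theta\ \mathcal{F};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ [f,L']_\Theta\ \mathcal{F};\ \mathcal{S} \mid \mathbf{A}}
\]

\[
\dfrac{\begin{array}{c}
S_{\Pi,f,L} = \mathsf{let}\ y = g\langle \alpha_0, \ldots, \alpha_{m-1}\rangle(x_0, \ldots, x_{n-1});\ \mathsf{goto}\ L' \\
\Sigma_{\Pi,g} = \langle \alpha'_0, \ldots, \alpha'_{m-1} \mid \cdots\rangle(x'_0\colon T_0, \ldots, x'_{n-1}\colon T_{n-1}) \qquad \Theta' = \{(\alpha'_j, \alpha_j\Theta) \mid j \in [m]\}
\end{array}}
{[f,L]_\Theta\ \mathcal{F} + \{(x_i, \hat v_i) \mid i \in [n]\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ [g,\mathsf{entry}]_{\Theta'}\ \{(x'_i, \hat v_i) \mid i \in [n]\};\ [f,L']_\Theta\ y, \mathcal{F};\ \mathcal{S} \mid \mathbf{A}}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{return}\ x}
{[f,L]_\Theta\ \{(x, \hat v)\};\ [g,L']_{\Theta'}\ x', \mathcal{F}';\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ [g,L']_{\Theta'}\ \mathcal{F}' + \{(x', \hat v)\};\ \mathcal{S} \mid \mathbf{A}}
\qquad
\dfrac{S_{\Pi,f,L} = \mathsf{return}\ x}
{\mathrm{final}_\Pi\bigl([f,L]_\Theta\ \{(x, \hat v)\} \mid \mathbf{A}\bigr)}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{intro}\ \alpha;\ \mathsf{goto}\ L' \qquad \mathcal{S} \text{ has } n \text{ layers} \qquad A_{\mathrm{ex}} = \{\alpha^{(k)} \in A \mid k < n\}}
{[f,L]_\Theta\ \mathcal{F};\ \mathcal{S} \mid (A, R) \ \to_\Pi\ [f,L']_{\Theta + \{(\alpha, \alpha^{(n)})\}}\ \mathcal{F};\ \mathcal{S} \mid \bigl(\{\alpha^{(n)}\} + A,\ \{\alpha^{(n)}\} \times (\{\alpha^{(n)}\} + A_{\mathrm{ex}}) + R\bigr)}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{now}\ \alpha;\ \mathsf{goto}\ L'}
{[f,L]_{\{(\alpha, \alpha^{(n)})\} + \Theta}\ \mathcal{F};\ \mathcal{S} \mid (\{\alpha^{(n)}\} + A, R) \ \to_\Pi\ [f,L']_\Theta\ \mathcal{F};\ \mathcal{S} \mid \bigl(A,\ \{(\beta^{(k)}, \gamma^{(l)}) \in R \mid \beta^{(k)} \ne \alpha^{(n)}\}\bigr)}
\]

\[
\dfrac{S_{\Pi,f,L} = \alpha \le \beta;\ \mathsf{goto}\ L'}
{[f,L]_\Theta\ \mathcal{F};\ \mathcal{S} \mid (A, R) \ \to_\Pi\ [f,L']_\Theta\ \mathcal{F};\ \mathcal{S} \mid \bigl(A,\ (\{(\Theta(\alpha), \Theta(\beta))\} + R)^+\bigr)}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{let}\ {*y} = \mathit{const};\ \mathsf{goto}\ L'}
{[f,L]_\Theta\ \mathcal{F};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ [f,L']_\Theta\ \mathcal{F} + \{(y, \langle \mathit{const}\rangle)\};\ \mathcal{S} \mid \mathbf{A}}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{let}\ {*y} = {*x}\ \mathit{op}\ {*x'};\ \mathsf{goto}\ L'}
{[f,L]_\Theta\ \mathcal{F};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ [f,L']_\Theta\ \mathcal{F} + \{(y, \langle \mathrm{val}(\mathcal{F}(x))\ [\![\mathit{op}]\!]\ \mathrm{val}(\mathcal{F}(x'))\rangle)\};\ \mathcal{S} \mid \mathbf{A}}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{let}\ {*y} = \mathsf{rand}();\ \mathsf{goto}\ L'}
{[f,L]_\Theta\ \mathcal{F};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ [f,L']_\Theta\ \mathcal{F} + \{(y, \langle n\rangle)\};\ \mathcal{S} \mid \mathbf{A}}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{let}\ {*y} = \mathsf{inj}_i^{T_0+T_1}\ {*x};\ \mathsf{goto}\ L'}
{[f,L]_\Theta\ \mathcal{F} + \{(x, \langle \hat v_*\rangle)\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ [f,L']_\Theta\ \mathcal{F} + \{(y, \langle \mathsf{inj}_i\ \hat v_*\rangle)\};\ \mathcal{S} \mid \mathbf{A}}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{match}\ {*x}\ \{\mathsf{inj}_0\ {*y_0} \to \mathsf{goto}\ L'_0,\ \mathsf{inj}_1\ {*y_1} \to \mathsf{goto}\ L'_1\} \qquad \mathrm{Ty}_{\Pi,f,L}(x) = \check P\,(T_0 + T_1)}
{[f,L]_\Theta\ \mathcal{F} + \{(x, \langle \mathsf{inj}_i\ \hat v_{*!}\rangle)\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ [f,L'_i]_\Theta\ \mathcal{F} + \{(y_i, \langle \hat v_{*!}\rangle)\};\ \mathcal{S} \mid \mathbf{A}}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{match}\ {*x}\ \{\mathsf{inj}_0\ {*y_0} \to \mathsf{goto}\ L'_0,\ \mathsf{inj}_1\ {*y_1} \to \mathsf{goto}\ L'_1\} \qquad \mathrm{Ty}_{\Pi,f,L}(x) = \mathsf{mut}_\alpha\,(T_0 + T_1) \qquad \mathbf{x}_{\circ !} \text{ is fresh}}
{[f,L]_\Theta\ \mathcal{F} + \{(x, \langle \mathsf{inj}_i\ \hat v_{*!}, \mathbf{x}_\circ\rangle)\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ \bigl([f,L'_i]_\Theta\ \mathcal{F} + \{(y_i, \langle \hat v_{*!}, \mathbf{x}_{\circ !}\rangle)\};\ \mathcal{S} \mid \mathbf{A}\bigr)\bigl[\mathsf{inj}_i\ \mathbf{x}_{\circ !}/\mathbf{x}_\circ\bigr]}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{let}\ {*y} = ({*x_0}, {*x_1});\ \mathsf{goto}\ L'}
{[f,L]_\Theta\ \mathcal{F} + \{(x_0, \langle \hat v_{*0}\rangle), (x_1, \langle \hat v_{*1}\rangle)\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ [f,L']_\Theta\ \mathcal{F} + \{(y, \langle (\hat v_{*0}, \hat v_{*1})\rangle)\};\ \mathcal{S} \mid \mathbf{A}}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{let}\ ({*y_0}, {*y_1}) = {*x};\ \mathsf{goto}\ L'}
{[f,L]_\Theta\ \mathcal{F} + \{(x, \langle (\hat v_{*0}, \hat v_{*1})\rangle)\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ [f,L']_\Theta\ \mathcal{F} + \{(y_0, \langle \hat v_{*0}\rangle), (y_1, \langle \hat v_{*1}\rangle)\};\ \mathcal{S} \mid \mathbf{A}}
\]

\[
\dfrac{S_{\Pi,f,L} = \mathsf{let}\ ({*y_0}, {*y_1}) = {*x};\ \mathsf{goto}\ L' \qquad \mathbf{x}_{\circ 0}, \mathbf{x}_{\circ 1} \text{ are fresh}}
{[f,L]_\Theta\ \mathcal{F} + \{(x, \langle (\hat v_{*0}, \hat v_{*1}), \mathbf{x}_\circ\rangle)\};\ \mathcal{S} \mid \mathbf{A} \ \to_\Pi\ \bigl([f,L']_\Theta\ \mathcal{F} + \{(y_0, \langle \hat v_{*0}, \mathbf{x}_{\circ 0}\rangle), (y_1, \langle \hat v_{*1}, \mathbf{x}_{\circ 1}\rangle)\};\ \mathcal{S} \mid \mathbf{A}\bigr)\bigl[(\mathbf{x}_{\circ 0}, \mathbf{x}_{\circ 1})/\mathbf{x}_\circ\bigr]}
\]

Example 6 (Execution on Abstract Operational Semantics).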
The following is an example execution on abstract operational semantics for Example 1. It corresponds to Example 5, the example execution on concrete operational semantics. Here, $\mathbf{A} := (\{\alpha^{(0)}\}, \mathrm{Id}_{\{\alpha^{(0)}\}})$ and $\Theta := \{(\alpha, \alpha^{(0)})\}$.

\[
\begin{array}{l}
[\text{inc-max}, \mathsf{entry}]_\emptyset\ \{(oa, \langle 4\rangle), (ob, \langle 3\rangle)\} \mid (\emptyset, \emptyset) \\
\to [\text{inc-max}, \mathsf{L1}]_\Theta\ \{(oa, \langle 4\rangle), (ob, \langle 3\rangle)\} \mid \mathbf{A} \\
\to^+ [\text{inc-max}, \mathsf{L3}]_\Theta\ \{(ma, \langle 4, \mathbf{a}_\circ\rangle), (mb, \langle 3, \mathbf{b}_\circ\rangle), (oa, \langle \mathbf{a}_\circ\rangle), (ob, \langle \mathbf{b}_\circ\rangle)\} \mid \mathbf{A} \\
\to [\text{take-max}, \mathsf{entry}]_\Theta\ \{(ma, \langle 4, \mathbf{a}_\circ\rangle), (mb, \langle 3, \mathbf{b}_\circ\rangle)\};\ [\text{inc-max}, \mathsf{L4}]_\Theta\ mc, \{(oa, \langle \mathbf{a}_\circ\rangle), (ob, \langle \mathbf{b}_\circ\rangle)\} \mid \mathbf{A} \\
\to [\text{take-max}, \mathsf{L1}]_\Theta\ \{(ord, \langle \mathsf{inj}_1\,()\rangle), (ma, \langle 4, \mathbf{a}_\circ\rangle), (mb, \langle 3, \mathbf{b}_\circ\rangle)\};\ [\text{inc-max}, \mathsf{L4}]_\Theta\ mc, \{(oa, \langle \mathbf{a}_\circ\rangle), (ob, \langle \mathbf{b}_\circ\rangle)\} \mid \mathbf{A} \\
\to [\text{take-max}, \mathsf{L2}]_\Theta\ \{(ou, \langle ()\rangle), (ma, \langle 4, \mathbf{a}_\circ\rangle), (mb, \langle 3, \mathbf{b}_\circ\rangle)\};\ [\text{inc-max}, \mathsf{L4}]_\Theta\ mc, \{(oa, \langle \mathbf{a}_\circ\rangle), (ob, \langle \mathbf{b}_\circ\rangle)\} \mid \mathbf{A} \\
\to^+ [\text{take-max}, \mathsf{L4}]_\Theta\ \{(ma, \langle 4, \mathbf{a}_\circ\rangle)\};\ [\text{inc-max}, \mathsf{L4}]_\Theta\ mc, \{(oa, \langle \mathbf{a}_\circ\rangle), (ob, \langle 3\rangle)\} \mid \mathbf{A} \\
\to [\text{inc-max}, \mathsf{L4}]_\Theta\ \{(mc, \langle 4, \mathbf{a}_\circ\rangle), (oa, \langle \mathbf{a}_\circ\rangle), (ob, \langle 3\rangle)\} \mid \mathbf{A} \\
\to [\text{inc-max}, \mathsf{L5}]_\Theta\ \{(o1, \langle 1\rangle), (mc, \langle 4, \mathbf{a}_\circ\rangle), (oa, \langle \mathbf{a}_\circ\rangle), (ob, \langle 3\rangle)\} \mid \mathbf{A} \\
\to^+ [\text{inc-max}, \mathsf{L7}]_\Theta\ \{(oc', \langle 5\rangle), (mc, \langle 4, \mathbf{a}_\circ\rangle), (oa, \langle \mathbf{a}_\circ\rangle), (ob, \langle 3\rangle)\} \mid \mathbf{A} \\
\to [\text{inc-max}, \mathsf{L8}]_\Theta\ \{(oc', \langle 4\rangle), (mc, \langle 5, \mathbf{a}_\circ\rangle), (oa, \langle \mathbf{a}_\circ\rangle), (ob, \langle 3\rangle)\} \mid \mathbf{A} \\
\to^+ [\text{inc-max}, \mathsf{L10}]_\Theta\ \{(oa, \langle 5\rangle), (ob, \langle 3\rangle)\} \mid \mathbf{A} \\
\to [\text{inc-max}, \mathsf{L11}]_\emptyset\ \{(oa, \langle 5\rangle), (ob, \langle 3\rangle)\} \mid (\emptyset, \emptyset) \\
\to^+ [\text{inc-max}, \mathsf{L14}]_\emptyset\ \{(or, \langle \mathsf{inj}_1\,()\rangle)\} \mid (\emptyset, \emptyset)
\end{array}
\]

The abstract variables $\mathbf{a}_\circ$ and $\mathbf{b}_\circ$ are introduced for the mutable borrows of oa and ob. In the call of take-max, mb is released, whereby the variable $\mathbf{b}_\circ$ is set to the value 3, and the variable $\mathbf{a}_\circ$ is passed to mc. After the increment is performed, mc is released, and thereby $\mathbf{a}_\circ$ is set to the updated value 5.

C.2 Safety on Abstract Configurations
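It is natural to require for an abstract configuration that each abstract variable is shared by the borrower and the lender and is not used elsewhere. A stack of borrows (caused by reborrows) can be described as a chain of abstract variables (e.g. $\langle v, \mathbf{x}\rangle$, $\langle \mathbf{x}, \mathbf{y}\rangle$, $\langle \mathbf{y}\rangle$).

To describe such restrictions, we define the safety on an abstract configuration '$\mathrm{safe}_\Pi(\mathcal{C})$'. We also show progression and preservation regarding safety on abstract operational semantics, as a part of soundness of COR's type system.

(Footnote: We should take care of the cases where a mutable reference is immutably borrowed (e.g. $\mathsf{immut}_\alpha\,\mathsf{mut}_\beta\,T$), because immutable references can be unrestrictedly copied. Later, when we define 'summary' judgments, we get over this problem using access modes.)

Summary. An abstract variable summary $\mathcal{X}$ is a finite multiset of items of form '$\mathrm{give}_\alpha(\mathbf{x} :: T)$' or '$\mathrm{take}_\alpha(\mathbf{x} :: T)$'.

Now, '$\mathrm{summary}^{\mathbf{a}}_D(\hat v :: T \mid \mathcal{X})$' (the pre-value $\hat v$ of type $T$ yields an abstract variable summary $\mathcal{X}$, under the access mode $D$ and the activeness $\mathbf{a}$) is defined as follows. Here, an access mode $D$ is either of form 'hot' or 'cold'.

\[
\dfrac{}{\mathrm{summary}^{\dagger\alpha}_D(\mathbf{x} :: T \mid \{\mathrm{take}_\alpha(\mathbf{x} :: T)\})}
\qquad
\dfrac{\mathrm{summary}^{\mathbf{a}}_{D \cdot \check P}(\hat v :: T \mid \mathcal{X})}{\mathrm{summary}^{\mathbf{a}}_D(\langle \hat v\rangle :: \check P\,T \mid \mathcal{X})}
\qquad
D \cdot \mathsf{own} := D \quad D \cdot \mathsf{immut}_\beta := \mathrm{cold}
\]

\[
\dfrac{\mathrm{summary}^{\mathbf{a}}_{\mathrm{hot}}(\hat v :: T \mid \mathcal{X})}{\mathrm{summary}^{\mathbf{a}}_{\mathrm{hot}}(\langle \hat v, \mathbf{x}\rangle :: \mathsf{mut}_\beta\,T \mid \mathcal{X} \oplus \{\mathrm{give}_\beta(\mathbf{x} :: T)\})}
\qquad
\dfrac{\mathrm{summary}^{\mathbf{a}}_{\mathrm{cold}}(\hat v :: T \mid \mathcal{X})}{\mathrm{summary}^{\mathbf{a}}_{\mathrm{cold}}(\langle \hat v, \hat w\rangle :: \mathsf{mut}_\beta\,T \mid \mathcal{X})}
\]

\[
\dfrac{\mathrm{summary}^{\mathbf{a}}_D(\hat v :: T[\mu X.T/X] \mid \mathcal{X})}{\mathrm{summary}^{\mathbf{a}}_D(\hat v :: \mu X.T \mid \mathcal{X})}
\qquad
\dfrac{}{\mathrm{summary}^{\mathbf{a}}_D(\mathit{const} :: T \mid \emptyset)}
\]

\[
\dfrac{\mathrm{summary}^{\mathbf{a}}_D(\hat v :: T_i \mid \mathcal{X})}{\mathrm{summary}^{\mathbf{a}}_D(\mathsf{inj}_i\ \hat v :: T_0 + T_1 \mid \mathcal{X})}
\qquad
\dfrac{\mathrm{summary}^{\mathbf{a}}_D(\hat v_0 :: T_0 \mid \mathcal{X}_0) \qquad \mathrm{summary}^{\mathbf{a}}_D(\hat v_1 :: T_1 \mid \mathcal{X}_1)}{\mathrm{summary}^{\mathbf{a}}_D((\hat v_0, \hat v_1) :: T_0 \times T_1 \mid \mathcal{X}_0 \oplus \mathcal{X}_1)}
\]

'$\mathrm{summary}_\Theta(\mathcal{F} :: \Gamma \mid \mathcal{X})$' (the abstract stack frame $\mathcal{F}$ respecting the variable context $\Gamma$ yields $\mathcal{X}$, under the lifetime parameter context $\Theta$) is defined as follows.

\[
\dfrac{\mathrm{dom}\,\mathcal{F} = \mathrm{dom}\,\Gamma \qquad \text{for any } x \colon^{\mathbf{a}} T \in \Gamma,\ \mathrm{summary}^{\mathbf{a}}_{\mathrm{hot}}(\mathcal{F}(x) :: T\Theta \mid \mathcal{X}_x)}
{\mathrm{summary}_\Theta\bigl(\mathcal{F} :: \Gamma \bigm| \textstyle\bigoplus_{x \colon^{\mathbf{a}} T \in \Gamma} \mathcal{X}_x\bigr)}
\]

Finally, '$\mathrm{summary}_\Pi(\mathcal{C} \mid \mathcal{X})$' (the abstract configuration $\mathcal{C}$ yields $\mathcal{X}$ under the program $\Pi$) is defined as follows.

\[
\dfrac{\text{for any } i \in [n+1],\ \mathrm{summary}_{\Theta_i}(\mathcal{F}_i :: \Gamma_{\Pi,f_i,L_i} \mid \mathcal{X}_i)}
{\mathrm{summary}_\Pi\bigl([f_0, L_0]_{\Theta_0}\ \mathcal{F}_0;\ [f_1, L_1]_{\Theta_1}\ x_1, \mathcal{F}_1;\ \cdots;\ [f_n, L_n]_{\Theta_n}\ x_n, \mathcal{F}_n \mid \mathbf{A} \bigm| \textstyle\bigoplus_{i=0}^{n} \mathcal{X}_i\bigr)}
\]

Lifetime Safety. '$\mathrm{lifetimeSafe}_i(\mathbf{A}_{\mathrm{global}}, \Theta \mid \mathbf{A}_{\mathrm{local}}, A_{\mathrm{ex}})$' (the global lifetime context $\mathbf{A}_{\mathrm{global}}$ with the lifetime parameter context $\Theta$ is safe on lifetimes with respect to the (local) lifetime context $\mathbf{A}_{\mathrm{local}}$ from the type system and the set of lifetime parameters $A_{\mathrm{ex}}$, under the stack frame index $i$) is defined as follows.

\[
\dfrac{\begin{array}{c}
\mathrm{dom}\,\Theta = |\mathbf{A}_{\mathrm{local}}| \qquad \text{for any } \alpha \in A_{\mathrm{ex}},\ \text{letting } \beta^{(k)} = \Theta(\alpha),\ k < i \text{ holds} \\
\text{for any } \alpha \in |\mathbf{A}_{\mathrm{local}}| - A_{\mathrm{ex}},\ \Theta(\alpha) = \alpha^{(i)} \\
\text{for any } (\alpha, \beta) \in |\mathbf{A}_{\mathrm{local}}|^2 - A_{\mathrm{ex}}^2,\ \alpha \le_{\mathbf{A}_{\mathrm{local}}} \beta \iff \Theta(\alpha) \le_{\mathbf{A}_{\mathrm{global}}} \Theta(\beta) \\
\text{for any } \alpha, \beta \in A_{\mathrm{ex}},\ \alpha \le_{\mathbf{A}_{\mathrm{local}}} \beta \implies \Theta(\alpha) \le_{\mathbf{A}_{\mathrm{global}}} \Theta(\beta)
\end{array}}
{\mathrm{lifetimeSafe}_i(\mathbf{A}_{\mathrm{global}}, \Theta \mid \mathbf{A}_{\mathrm{local}}, A_{\mathrm{ex}})}
\]

'$\mathrm{lifetimeSafe}_\Pi\bigl(\mathbf{A}_{\mathrm{global}}, (f_i, L_i, \Theta_i)_{i=0}^{n}\bigr)$' ($\mathbf{A}_{\mathrm{global}}$ with the finite sequence of function names, labels and lifetime parameter contexts $(f_i, L_i, \Theta_i)_{i=0}^{n}$ is safe on lifetimes under the program $\Pi$) is defined as follows.

\[
\dfrac{\begin{array}{c}
\text{for any } i \in [n+1],\ \mathrm{lifetimeSafe}_i(\mathbf{A}_{\mathrm{global}}, \Theta_i \mid \mathbf{A}_{\Pi,f_i,L_i}, A_{\mathrm{ex}\,\Pi,f_i}) \\
\mathrm{card}\,|\mathbf{A}_{\mathrm{global}}| = \sum_{i=0}^{n} \mathrm{card}\,\bigl(|\mathbf{A}_{\Pi,f_i,L_i}| - A_{\mathrm{ex}\,\Pi,f_i}\bigr)
\end{array}}
{\mathrm{lifetimeSafe}_\Pi\bigl(\mathbf{A}_{\mathrm{global}}, (f_i, L_i, \Theta_i)_{i=0}^{n}\bigr)}
\]

Here, $\mathbf{A}_{\Pi,f,L}$ is the lifetime context for the label $L$ of $f$ in $\Pi$, and $\mathrm{card}\,X$ is the cardinality of $X$.

Finally, '$\mathrm{lifetimeSafe}_\Pi(\mathcal{C})$' (the abstract configuration $\mathcal{C}$ is safe on lifetimes under the program $\Pi$) is defined as follows.

\[
\dfrac{\mathrm{lifetimeSafe}_\Pi\bigl(\mathbf{A}_{\mathrm{global}}, (f_i, L_i, \Theta_i)_{i=0}^{n}\bigr)}
{\mathrm{lifetimeSafe}_\Pi\bigl([f_n, L_n]_{\Theta_n}\ \mathcal{F}_n;\ [f_{n-1}, L_{n-1}]_{\Theta_{n-1}}\ x_{n-1}, \mathcal{F}_{n-1};\ \cdots;\ [f_0, L_0]_{\Theta_0}\ x_0, \mathcal{F}_0 \mid \mathbf{A}_{\mathrm{global}}\bigr)}
\]

Safety.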
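We first define the safety on abstract variable summaries. '$\mathrm{safe}_{\mathbf{A}}(\mathbf{x}, \mathcal{X})$' is defined as follows. Here, $T \sim_{\mathbf{A}} U$ means $T \le_{\mathbf{A}} U \land U \le_{\mathbf{A}} T$ (the type equivalence).

\[
\dfrac{\mathcal{X}(\mathbf{x}) = \{\!|\mathrm{give}_\alpha(\mathbf{x} :: T), \mathrm{take}_\beta(\mathbf{x} :: T')|\!\} \qquad T \sim_{\mathbf{A}} T' \qquad \alpha \le_{\mathbf{A}} \beta}{\mathrm{safe}_{\mathbf{A}}(\mathbf{x}, \mathcal{X})}
\qquad
\dfrac{\mathcal{X}(\mathbf{x}) = \emptyset}{\mathrm{safe}_{\mathbf{A}}(\mathbf{x}, \mathcal{X})}
\]

Here, $\mathcal{X}(\mathbf{x})$ is the multiset of the items of form '$\mathrm{give}_\gamma(\mathbf{x} :: U)$' / '$\mathrm{take}_\gamma(\mathbf{x} :: U)$' in $\mathcal{X}$.

'$\mathrm{safe}_{\mathbf{A}}(\mathcal{X})$' means that $\mathrm{safe}_{\mathbf{A}}(\mathbf{x}, \mathcal{X})$ holds for any $\mathbf{x}$.

Finally, '$\mathrm{safe}_\Pi(\mathcal{C})$' is defined as follows.

\[
\dfrac{\mathrm{summary}_\Pi(\mathcal{C} \mid \mathcal{X}) \qquad \mathrm{lifetimeSafe}_\Pi(\mathcal{C}) \qquad \mathcal{C} = \cdots \mid \mathbf{A} \qquad \mathrm{safe}_{\mathbf{A}}(\mathcal{X})}{\mathrm{safe}_\Pi(\mathcal{C})}
\]

Property 1 (Safety on an Abstract Configuration Ensures Progression).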
For any $\Pi$ and $\mathcal{C}$ such that $\mathrm{safe}_\Pi(\mathcal{C})$ holds and $\mathrm{final}_\Pi(\mathcal{C})$ does not hold, there exists $\mathcal{C}'$ satisfying $\mathcal{C} \to_\Pi \mathcal{C}'$.

Proof. Clear. The important guarantee that the safety on an abstract configuration provides is that, in the pre-value assigned to each active variable, abstract variables do not appear except in the form $\langle \hat v, \mathbf{x}\rangle$. □

Lemma 1 (Safety on the Abstract Configuration is Preserved).
For any $\Pi$ and $\mathcal{C}$, $\mathcal{C}'$ such that $\mathrm{safe}_\Pi(\mathcal{C})$ and $\mathcal{C} \to_\Pi \mathcal{C}'$ hold, $\mathrm{safe}_\Pi(\mathcal{C}')$ is satisfied.

Proof. Straightforward. The key point is the preservation of safety on the abstract variable summary. Below we check some tricky cases.
Type Weakening.
Type weakening (x as T) essentially only changes lifetimes on types. A lifetime on a type can become earlier if it is not guarded by any $\mathsf{mut}_\alpha$. Thus only the following changes happen on the abstract variable summary: (i) for an item of form '$\mathrm{give}_\alpha(\mathbf{x} :: T)$', $\alpha$ can get earlier and $T$ can be weakened; and (ii) for an item of form '$\mathrm{take}_\alpha(\mathbf{x} :: T)$', $\alpha$ does not change and $T$ can be weakened.

Mutable (Re)borrow.
When we perform let my = mutbor$_\alpha$ px, the abstract variable summary just gets two new items '$\mathrm{give}_\alpha(\mathbf{x}_\circ :: T)$' and '$\mathrm{take}_\alpha(\mathbf{x}_\circ :: T)$', for some $\mathbf{x}_\circ$ and $T$.

Release of a Mutable Reference.
When we release a mutable reference mx, whose pre-value is of form $\langle \hat v, \mathbf{x}_\circ\rangle$, only the following changes happen on the abstract variable summary: (i) the items of form '$\mathrm{give}_\alpha(\mathbf{x}_\circ :: T)$' and '$\mathrm{take}_\beta(\mathbf{x}_\circ :: T')$' are removed; and (ii) since $\hat v$ moves to another variable, the type of each abstract variable in $\hat v$ may change into an equivalent type.

Ownership Weakening.
Similar to a release of a mutable reference.

Swap.
Swap (swap(∗x, ∗y)) does not actually alter the abstract variable summary.

Copying.
When data of type $T$ is copied, $T \colon \mathsf{copy}$ holds, which ensures that each mutable reference $\mathsf{mut}_\alpha\,U$ in $T$ is guarded by some immutable reference. Therefore the abstract variable summary does not change.

Subdivision of a Mutable Reference.
A mutable reference is subdivided in the following forms: pair destruction 'let (∗mx$_0$, ∗mx$_1$) = ∗mx', variant destruction 'match ∗mx { inj$_0$ ∗my → goto L, · · · }', and dereference 'let mx = ∗mpx'. When a mutable reference mx with a pre-value $\langle \hat v, \mathbf{x}\rangle$ is subdivided, the two items of form $\mathrm{give}_\alpha(\mathbf{x} :: T)$ and $\mathrm{take}_\beta(\mathbf{x} :: T')$ are accordingly 'subdivided' in the abstract variable summary. On close inspection, the safety turns out to be preserved.

Elimination of a Local Lifetime Variable.
Just after we eliminate a local lifetime variable $\alpha$ ('now $\alpha$'), since there remains no lifetime variable earlier than $\alpha$ in the lifetime context, the abstract variable summary has no item of form '$\mathrm{give}_{\alpha^{(n)}}(\mathbf{x} :: T)$' (for appropriate $n$). Therefore, just before (and just after) the lifetime elimination, the abstract variable summary has no item of form '$\mathrm{take}_{\alpha^{(n)}}(\mathbf{x} :: T')$'. □

C.3 SLDC Resolution
For the CHC representation of a COR program, we introduce a variant of SLD resolution, which we call SLDC resolution (Selective Linear Definite clause Calculative resolution). Interpreting each CHC as a deduction rule, SLDC resolution can be understood as a top-down construction of a proof tree from the left-hand side. SLDC resolution is designed to be complete with respect to the logic (Lemma 2).

A resolutive configuration $K$ and a pre-resolutive configuration $\hat K$ have the following form.

\[
\text{(resolutive configuration)}\quad K ::= \check\varphi_0, \ldots, \check\varphi_{n-1} \mid q
\qquad
\text{(pre-resolutive configuration)}\quad \hat K ::= \varphi_0, \ldots, \varphi_{n-1} \mid q
\]

The elementary formulas in a resolutive configuration can be understood as a model of a call stack. $q$ is a pattern that represents the returned value. This idea is later formalized in Appendix C.4.

$K \to_{(\mathbf{\Phi}, \Xi)} K'$ ($K$ can change into $K'$ by one step of SLDC resolution on $(\mathbf{\Phi}, \Xi)$) is defined by the following non-deterministic transformation from $K$ to $K'$.

1. The 'stack' part of $K$ should be non-empty. Let $K = f(p_0, \ldots, p_{m-1}), \check\varphi_1, \ldots, \check\varphi_n \mid q$. Take from $\mathbf{\Phi}$ any CHC $\Phi$ that unifies with the head of the stack of $K$. That is, $\Phi$ is of form $\forall x_0 \colon \sigma_0, \ldots, x_{l-1} \colon \sigma_{l-1}.\ f(p'_0, \ldots, p'_{m-1}) \Longleftarrow \psi_0 \land \cdots \land \psi_{k-1}$ and $p'_0, \ldots, p'_{m-1}$ unify with $p_0, \ldots, p_{m-1}$. Let us take the most general unifier $(\theta, \theta')$ such that $p_0\theta = p'_0\theta', \ldots, p_{m-1}\theta = p'_{m-1}\theta'$ hold. Here, $\theta$ maps variables to patterns. Now we have a pre-resolutive configuration $\hat K = \psi'_0, \ldots, \psi'_{k-1}, \check\varphi'_1, \ldots, \check\varphi'_n \mid q'$, where $\psi'_i := \psi_i\theta'$, $\check\varphi'_j := \check\varphi_j\theta$ and $q' := q\theta$.
2. We 'calculate' $\hat K$ into a resolutive configuration. That is, we repeat the following operations to update $\hat K$ until $\psi'_0, \ldots, \psi'_{k-1}$ all become elementary. $K'$ is set to the final version of $\hat K$.
– We substitute variables conservatively until there remain no terms of form $*x$, $\circ x$, $x.i$, $x\ \mathit{op}\ t$ / $t\ \mathit{op}\ x$; for each case, we replace $x$ with $\langle x_*\rangle$ / $\langle x_*, x_\circ\rangle$ (depending on the sort), $\langle x_*, x_\circ\rangle$, $(x_0, x_1)$, $n$, taking fresh variables.
– We replace each $*\langle t_*\rangle$ / $*\langle t_*, t_\circ\rangle$, $\circ\langle t_*, t_\circ\rangle$, $(t_0, t_1).i$, $n\ \mathit{op}\ n'$ with $t_*$, $t_\circ$, $t_i$, $n\ [\![\mathit{op}]\!]\ n'$.
– If there exists a variable $x$ that occurs only once in the pre-resolutive configuration $\hat K$, then we replace it with any value of the suitable sort.

We have carefully designed SLDC resolution to match abstract operational semantics, which assists the proof of Theorem 2.
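The second 'calculation' operation above is an ordinary bottom-up term simplification, and can be sketched as follows. This is only an illustration under stated assumptions: the `Term` type, the helper `b`, and the function `calc` are hypothetical names, the only operator modeled is integer addition, and the variable-refinement and single-occurrence operations are not modeled.

```rust
// Hypothetical term language for the 'calculation' step; not the paper's definition.
#[derive(Clone, Debug, PartialEq)]
enum Term {
    Var(String),
    Int(i64),
    Own(Box<Term>),             // <t>
    Mut(Box<Term>, Box<Term>),  // <t_*, t_o>
    Pair(Box<Term>, Box<Term>), // (t_0, t_1)
    Cur(Box<Term>),             // *t
    Fut(Box<Term>),             // o t (final value of a mutable reference)
    Proj(Box<Term>, usize),     // t.i
    Add(Box<Term>, Box<Term>),  // t + t' (the only operator in this sketch)
}

// Small helper to reduce boxing noise.
fn b(t: Term) -> Box<Term> {
    Box::new(t)
}

// Simplify bottom-up: *<t_*> and *<t_*, t_o> become t_*, o<t_*, t_o> becomes t_o,
// (t_0, t_1).i becomes t_i, and n + n' is evaluated. Stuck terms are left as-is.
fn calc(t: Term) -> Term {
    use Term::*;
    match t {
        Var(x) => Var(x),
        Int(n) => Int(n),
        Own(a) => Own(b(calc(*a))),
        Mut(a, c) => Mut(b(calc(*a)), b(calc(*c))),
        Pair(a, c) => Pair(b(calc(*a)), b(calc(*c))),
        Cur(a) => match calc(*a) {
            Own(v) | Mut(v, _) => *v,
            other => Cur(b(other)),
        },
        Fut(a) => match calc(*a) {
            Mut(_, w) => *w,
            other => Fut(b(other)),
        },
        Proj(a, i) => match calc(*a) {
            Pair(l, r) if i < 2 => {
                if i == 0 {
                    *l
                } else {
                    *r
                }
            }
            other => Proj(b(other), i),
        },
        Add(a, c) => match (calc(*a), calc(*c)) {
            (Int(m), Int(n)) => Int(m + n),
            (l, r) => Add(b(l), b(r)),
        },
    }
}
```

A term such as `*x` with `x` still a variable stays stuck in this sketch; in SLDC resolution, the first operation would first refine `x` into a pattern with fresh variables so that the simplification can proceed.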
Lemma 2 (Completeness of SLDC Resolution).
For any $(\mathbf{\Phi}, \Xi)$ and $f \in \mathrm{dom}\,\Xi$, the following are equivalent for any values $v_0, \ldots, v_{n-1}, w$ of the appropriate sorts.

1. $M^{\mathrm{least}}_{(\mathbf{\Phi}, \Xi)}(f)(v_0, \ldots, v_{n-1}, w)$ holds.
2. There exists a sequence $K_0, \ldots, K_N$ such that $K_0 = f(v_0, \ldots, v_{n-1}, r) \mid r$, $K_N = {} \mid p$ (with an empty stack), $K_0 \to_{(\mathbf{\Phi}, \Xi)} \cdots \to_{(\mathbf{\Phi}, \Xi)} K_N$, and $p$ can be refined into $w$ by instantiating variables.

Proof. Clear by thinking of derivation trees (which can be defined in a natural manner) on the CHC system $(\mathbf{\Phi}, \Xi)$. □

C.4 Equivalence of the AOS-based Model and the CHC Representation
We first show a bisimulation between abstract operational semantics and SLDCresolution (Lemma 3). Using the bisimulation, we can easily show the equivalenceof the AOS-based model and (the least model of) the CHC representation.
Bisimulation Lemma.
Interestingly, there is a bisimulation between the transition system of abstract operational semantics and the process of SLDC resolution.

$\mathcal{F} \rightsquigarrow^{\theta}_{f,L,\mathbf{r}} \check\varphi$ (the abstract stack frame $\mathcal{F}$ can be translated into the elementary formula $\check\varphi$, under $\theta$, $f$, $L$ and $\mathbf{r}$) is defined as follows. Here, $\theta$ maps abstract variables to (normal) variables. $\hat v\theta$ is the value made from $\hat v$ by replacing each $\mathbf{x}$ with $\theta(\mathbf{x})$. $\mathbf{r}$ is the abstract variable for taking the result.

(Footnote: The rule of SLDC resolution that replaces a variable occurring only once with an arbitrary value is a peculiar one; we use it to handle the 'let ∗y = rand()' instruction later for Lemma 3.)

\[
\dfrac{\text{the items of } \mathcal{F} \text{ are enumerated as } (x_0, \hat v_0), \ldots, (x_{n-1}, \hat v_{n-1})}
{\mathcal{F} \rightsquigarrow^{\theta}_{f,L,\mathbf{r}} f_L(\hat v_0\theta, \ldots, \hat v_{n-1}\theta, \mathbf{r}\theta)}
\]

Now, $\mathcal{C} \rightsquigarrow_\Pi K$ is defined as follows.

\[
\dfrac{\begin{array}{c}
\mathrm{safe}_\Pi(\mathcal{C}) \qquad \mathcal{C} = [f_0, L_0]_{\Theta_0}\ \mathcal{F}_0;\ [f_1, L_1]_{\Theta_1}\ x_1, \mathcal{F}_1;\ \cdots;\ [f_n, L_n]_{\Theta_n}\ x_n, \mathcal{F}_n \mid \mathbf{A} \\
\mathbf{r}_0, \ldots, \mathbf{r}_n \text{ are fresh in } \mathcal{C} \qquad \mathcal{F}_0 \rightsquigarrow^{\theta}_{f_0,L_0,\mathbf{r}_0} \check\varphi_0 \\
\text{for any } i \in [n],\ \mathcal{F}_{i+1} + \{(x_{i+1}, \mathbf{r}_i)\} \rightsquigarrow^{\theta}_{f_{i+1},L_{i+1},\mathbf{r}_{i+1}} \check\varphi_{i+1}
\end{array}}
{\mathcal{C} \rightsquigarrow_\Pi \check\varphi_0, \ldots, \check\varphi_n \mid \theta(\mathbf{r}_n)}
\]

Lemma 3 (Bisimulation between Abstract Operational Semantics and SLDC Resolution).
Take any $\Pi$, $\mathcal{C}$ and $K$ satisfying $\mathcal{C} \rightsquigarrow_\Pi K$. For any $\mathcal{C}'$ satisfying $\mathcal{C} \to_\Pi \mathcal{C}'$, there exists some $K'$ satisfying $K \to_{(\!|\Pi|\!)} K'$ and $\mathcal{C}' \rightsquigarrow_\Pi K'$. Likewise, for any $K'$ satisfying $K \to_{(\!|\Pi|\!)} K'$, there exists some $\mathcal{C}'$ satisfying $\mathcal{C} \to_\Pi \mathcal{C}'$ and $\mathcal{C}' \rightsquigarrow_\Pi K'$.

Proof. Straightforward. □
AOS-based Model and the Equivalence Theorem.
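Take any $\Pi$ and simple $f$. The AOS-based model (AOS stands for abstract operational semantics) for $f$, denoted by $f^{\mathrm{AOS}}_\Pi$, is the predicate defined by the following rule.

\[
\dfrac{\begin{array}{c}
\mathcal{C}_0 \to_\Pi \cdots \to_\Pi \mathcal{C}_N \qquad \mathrm{final}_\Pi(\mathcal{C}_N) \qquad \mathrm{safe}_\Pi(\mathcal{C}_0) \\
\mathcal{C}_0 = [f, \mathsf{entry}]_\emptyset\ \{(x_i, v_i) \mid i \in [n]\} \mid (\emptyset, \emptyset) \qquad \mathcal{C}_N = [f, L']_\emptyset\ \{(y, w)\} \mid (\emptyset, \emptyset)
\end{array}}
{f^{\mathrm{AOS}}_\Pi(v_0, \ldots, v_{n-1}, w)}
\]

Now we can prove the following theorem.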
Theorem 2 (Equivalence of the AOS-based Model and the CHC Representation).
For any $\Pi$ and simple $f$ in $\Pi$, $f^{\mathrm{AOS}}_\Pi$ is equivalent to $M_{(\!|\Pi|\!)}(f_{\mathsf{entry}})$.

Proof. Clear from the completeness of SLDC resolution (Lemma 2) and the bisimulation between abstract operational semantics and SLDC resolution (Lemma 3). □
C.5 Bisimulation between Concrete and Abstract Operational Semantics
Extending '$\mathrm{safe}_{\mathbf{H}}(\mathbf{F} :: \Gamma \mid \mathcal{F})$' introduced earlier, we define the safe readout '$\mathrm{safe}_\Pi(\mathbf{C} \mid \mathcal{C})$' of an abstract configuration from a concrete configuration. Interestingly, the safe readout is a bisimulation between concrete and abstract operational semantics (Lemma 5). We also establish progression and preservation regarding the safe readout, as a part of soundness of COR's type system in terms of concrete operational semantics, extending the soundness shown for abstract operational semantics in Appendix C.2.

Auxiliary Notions. An extended abstract variable summary $\hat{\mathcal{X}}$ is a finite multiset of items of form '$\mathrm{give}_\alpha(*a; \mathbf{x} :: T)$' or '$\mathrm{take}_\alpha(*a; \mathbf{x} :: T)$', where $a$ is an address. An extended access mode $\hat D$ is of form either 'hot' or '$\mathrm{cold}_\alpha$'. An extended memory footprint $\hat{\mathcal{M}}$ is a finite multiset of items of form '$\mathrm{hot}^{\mathbf{a}}(a)$' or '$\mathrm{cold}_\alpha(a)$', where $a$ is an address.

Readout.
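First, '$\mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\hat D}(a :: T \mid \hat v; \hat{\mathcal{X}}, \hat{\mathcal{M}})$' and '$\mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\hat D}(*a :: T \mid \hat v; \hat{\mathcal{X}}, \hat{\mathcal{M}})$' (the pointer of the address $a$ / the data at $a$, typed $T$, can be read out from the heap $\mathbf{H}$ as a pre-value $\hat v$, yielding an extended abstract variable summary $\hat{\mathcal{X}}$ and an extended memory footprint $\hat{\mathcal{M}}$, under the extended access mode $\hat D$ and the activeness $\mathbf{a}$) are defined by the following rules. Here, $\#T$ denotes the size of the type $T$.

\[
\dfrac{\mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\hat D \circ \check P}(*a :: T \mid \hat v; \hat{\mathcal{X}}, \hat{\mathcal{M}})}
{\mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\hat D}(a :: \check P\,T \mid \langle \hat v\rangle; \hat{\mathcal{X}}, \hat{\mathcal{M}})}
\qquad
\hat D \circ \mathsf{own} := \hat D \quad \mathrm{hot} \circ \mathsf{immut}_\beta := \mathrm{cold}_\beta \quad \mathrm{cold}_\alpha \circ \mathsf{immut}_\beta := \mathrm{cold}_\alpha
\]

\[
\dfrac{\mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\mathrm{hot}}(*a :: T \mid \hat v; \hat{\mathcal{X}}, \hat{\mathcal{M}})}
{\mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\mathrm{hot}}\bigl(a :: \mathsf{mut}_\beta\,T \bigm| \langle \hat v, \mathbf{x}\rangle; \hat{\mathcal{X}} \oplus \{\!|\mathrm{give}_\beta(*a; \mathbf{x} :: T)|\!\}, \hat{\mathcal{M}}\bigr)}
\qquad
\dfrac{\mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\mathrm{cold}_\beta}(*a :: T \mid \hat v; \hat{\mathcal{X}}, \hat{\mathcal{M}})}
{\mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\mathrm{cold}_\beta}\bigl(a :: \mathsf{mut}_\beta\,T \bigm| \langle \hat v, \hat w\rangle; \hat{\mathcal{X}}, \hat{\mathcal{M}}\bigr)}
\]

\[
\dfrac{}{\mathrm{readout}^{\dagger\alpha}_{\mathbf{H},\hat D}(*a :: T \mid \mathbf{x}; \{\!|\mathrm{take}_\alpha(*a; \mathbf{x} :: T)|\!\}, \emptyset)}
\]

\[
\dfrac{\mathbf{H}(a) = a' \qquad \mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\hat D}(a' :: P\,T \mid \hat v; \hat{\mathcal{X}}, \hat{\mathcal{M}})}
{\mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\hat D}(*a :: P\,T \mid \hat v; \hat{\mathcal{X}}, \hat{\mathcal{M}} \oplus \{\!|\hat D^{\mathbf{a}}(a)|\!\})}
\qquad
\hat D^{\mathbf{a}}(a) := \begin{cases} \mathrm{hot}^{\mathbf{a}}(a) & (\hat D = \mathrm{hot}) \\ \mathrm{cold}_\beta(a) & (\hat D = \mathrm{cold}_\beta) \end{cases}
\]

\[
\dfrac{\mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\hat D}(*a :: T[\mu X.T/X] \mid \hat v; \hat{\mathcal{X}}, \hat{\mathcal{M}})}
{\mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\hat D}(*a :: \mu X.T \mid \hat v; \hat{\mathcal{X}}, \hat{\mathcal{M}})}
\qquad
\dfrac{\mathbf{H}(a) = n}{\mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\hat D}(*a :: \mathsf{int} \mid n; \emptyset, \{\!|\hat D^{\mathbf{a}}(a)|\!\})}
\qquad
\dfrac{}{\mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\hat D}(*a :: \mathsf{unit} \mid (); \emptyset, \emptyset)}
\]

\[
\dfrac{\begin{array}{c}
\mathbf{H}(a) = i \in [2] \qquad \mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\hat D}(*(a+1) :: T_i \mid \hat v; \hat{\mathcal{X}}, \hat{\mathcal{M}}) \qquad n_0 = (\#T_{1-i} - \#T_i)_{\ge 0} \\
\text{for any } k \in [n_0],\ \mathbf{H}(a + 1 + \#T_i + k) = 0 \qquad \hat{\mathcal{M}}_0 = \{\!|\hat D^{\mathbf{a}}(a + 1 + \#T_i + k) \mid k \in [n_0]|\!\}
\end{array}}
{\mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\hat D}\bigl(*a :: T_0 + T_1 \bigm| \mathsf{inj}_i\ \hat v; \hat{\mathcal{X}}, \hat{\mathcal{M}} \oplus \{\!|\hat D^{\mathbf{a}}(a)|\!\} \oplus \hat{\mathcal{M}}_0\bigr)}
\]

\[
\dfrac{\mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\hat D}\bigl(*a :: T_0 \bigm| \hat v_0; \hat{\mathcal{X}}_0, \hat{\mathcal{M}}_0\bigr) \qquad \mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\hat D}\bigl(*(a + \#T_0) :: T_1 \bigm| \hat v_1; \hat{\mathcal{X}}_1, \hat{\mathcal{M}}_1\bigr)}
{\mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\hat D}\bigl(*a :: T_0 \times T_1 \bigm| (\hat v_0, \hat v_1); \hat{\mathcal{X}}_0 \oplus \hat{\mathcal{X}}_1, \hat{\mathcal{M}}_0 \oplus \hat{\mathcal{M}}_1\bigr)}
\]

Next, '$\mathrm{readout}_{\mathbf{H},\Theta}(\mathbf{F} :: \Gamma \mid \mathcal{F}; \hat{\mathcal{X}}, \hat{\mathcal{M}})$' (the stack frame $\mathbf{F}$ respecting the variable context $\Gamma$ can be read out from $\mathbf{H}$ as an abstract stack frame $\mathcal{F}$, yielding $\hat{\mathcal{X}}$ and $\hat{\mathcal{M}}$, under the lifetime parameter context $\Theta$) is defined as follows.

\[
\dfrac{\mathrm{dom}\,\mathbf{F} = \mathrm{dom}\,\Gamma \qquad \text{for any } x \colon^{\mathbf{a}} T \in \Gamma,\ \mathrm{readout}^{\mathbf{a}}_{\mathbf{H},\mathrm{hot}}(\mathbf{F}(x) :: T\Theta \mid \hat v_x; \hat{\mathcal{X}}_x, \hat{\mathcal{M}}_x)}
{\mathrm{readout}_{\mathbf{H},\Theta}\bigl(\mathbf{F} :: \Gamma \bigm| \{(x, \hat v_x) \mid x \in \mathrm{dom}\,\Gamma\}; \textstyle\bigoplus_{x \in \mathrm{dom}\,\Gamma} \hat{\mathcal{X}}_x, \bigoplus_{x \in \mathrm{dom}\,\Gamma} \hat{\mathcal{M}}_x\bigr)}
\]

Finally, '$\mathrm{readout}_\Pi(\mathbf{C} \mid \mathcal{C}; \hat{\mathcal{X}}, \hat{\mathcal{M}})$' (the data of the concrete configuration $\mathbf{C}$ can be read out as the abstract configuration $\mathcal{C}$, yielding $\hat{\mathcal{X}}$ and $\hat{\mathcal{M}}$, under the program $\Pi$) is defined as follows.

\[
\dfrac{\text{for any } i \in [n+1],\ \mathrm{readout}_{\mathbf{H},\Theta_i}(\mathbf{F}_i :: \Gamma_{\Pi,f_i,L_i} \mid \mathcal{F}_i; \hat{\mathcal{X}}_i, \hat{\mathcal{M}}_i)}
{\begin{array}{l}
\mathrm{readout}_\Pi\bigl([f_0, L_0]\ \mathbf{F}_0;\ [f_1, L_1]\ x_1, \mathbf{F}_1;\ \cdots;\ [f_n, L_n]\ x_n, \mathbf{F}_n \mid \mathbf{H} \bigm| \\
\qquad [f_0, L_0]_{\Theta_0}\ \mathcal{F}_0;\ [f_1, L_1]_{\Theta_1}\ x_1, \mathcal{F}_1;\ \cdots;\ [f_n, L_n]_{\Theta_n}\ x_n, \mathcal{F}_n \mid \mathbf{A};\ \textstyle\bigoplus_{i=0}^{n} \hat{\mathcal{X}}_i, \bigoplus_{i=0}^{n} \hat{\mathcal{M}}_i\bigr)
\end{array}}
\]

Safety.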
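We define the safety on extended abstract variable summaries and extended memory footprints.

'$\mathrm{safe}_{\mathbf{A}}(\mathbf{x}, \hat{\mathcal{X}})$' is defined as follows.

\[
\dfrac{\hat{\mathcal{X}}(\mathbf{x}) = \{\!|\mathrm{give}_\alpha(*a; \mathbf{x} :: T), \mathrm{take}_\beta(*a; \mathbf{x} :: T')|\!\} \qquad T \sim_{\mathbf{A}} T' \qquad \alpha \le_{\mathbf{A}} \beta}{\mathrm{safe}_{\mathbf{A}}(\mathbf{x}, \hat{\mathcal{X}})}
\qquad
\dfrac{\hat{\mathcal{X}}(\mathbf{x}) = \emptyset}{\mathrm{safe}_{\mathbf{A}}(\mathbf{x}, \hat{\mathcal{X}})}
\]

Here, $\hat{\mathcal{X}}(\mathbf{x})$ is the multiset of items of form '$\mathrm{give}_\gamma(*b; \mathbf{x} :: U)$' / '$\mathrm{take}_\gamma(*b; \mathbf{x} :: U)$' in $\hat{\mathcal{X}}$. '$\mathrm{safe}_{\mathbf{A}}(\hat{\mathcal{X}})$' means that $\mathrm{safe}_{\mathbf{A}}(\mathbf{x}, \hat{\mathcal{X}})$ holds for any $\mathbf{x}$.

'$\mathrm{safe}_{\mathbf{A}}(a, \hat{\mathcal{M}})$' is defined as follows.

\[
\dfrac{\hat{\mathcal{M}}(a) = \{\!|\mathrm{hot}^{\mathbf{a}}(a)|\!\}}{\mathrm{safe}_{\mathbf{A}}(a, \hat{\mathcal{M}})}
\qquad
\dfrac{\hat{\mathcal{M}}(a) = \emptyset}{\mathrm{safe}_{\mathbf{A}}(a, \hat{\mathcal{M}})}
\qquad
\dfrac{\hat{\mathcal{M}}(a) = \{\!|\mathrm{hot}^{\dagger\alpha}(a), \mathrm{cold}_{\beta_0}(a), \ldots, \mathrm{cold}_{\beta_{n-1}}(a)|\!\} \qquad \text{for any } i \in [n],\ \beta_i \le_{\mathbf{A}} \alpha}{\mathrm{safe}_{\mathbf{A}}(a, \hat{\mathcal{M}})}
\]

Here, $\hat{\mathcal{M}}(a)$ is the multiset of items of form $\mathrm{hot}^{\mathbf{a}}(a)$ / $\mathrm{cold}_\alpha(a)$ in $\hat{\mathcal{M}}$. '$\mathrm{safe}_{\mathbf{A}}(\hat{\mathcal{M}})$' means that $\mathrm{safe}_{\mathbf{A}}(a, \hat{\mathcal{M}})$ holds for any address $a$.

Safe Readout.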
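Finally, '$\mathrm{safe}_\Pi(\mathbf{C} \mid \mathcal{C})$' (the data of the concrete configuration $\mathbf{C}$ can be safely read out as the abstract configuration $\mathcal{C}$ under $\Pi$) is defined as follows.

\[
\dfrac{\mathrm{readout}_\Pi(\mathbf{C} \mid \mathcal{C}; \hat{\mathcal{X}}, \hat{\mathcal{M}}) \qquad \mathrm{lifetimeSafe}_\Pi(\mathcal{C}) \qquad \mathcal{C} = \cdots \mid \mathbf{A} \qquad \mathrm{safe}_{\mathbf{A}}(\hat{\mathcal{X}}) \qquad \mathrm{safe}_{\mathbf{A}}(\hat{\mathcal{M}})}
{\mathrm{safe}_\Pi(\mathbf{C} \mid \mathcal{C})}
\]

'$\mathrm{safe}_\Pi(\mathbf{C})$' means that $\mathrm{safe}_\Pi(\mathbf{C} \mid \mathcal{C})$ holds for some $\mathcal{C}$.

Property 2 (Safety on a Concrete Configuration Ensures Progression).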
For any $\Pi$ and $\mathbf{C}$ such that $\mathrm{safe}_\Pi(\mathbf{C})$ holds and $\mathrm{final}_\Pi(\mathbf{C})$ does not hold, there exists some $\mathbf{C}'$ satisfying $\mathbf{C} \to_\Pi \mathbf{C}'$.

Proof. Clear. One important guarantee the safety provides is that the data is stored in the heap in an expected form. □

Lemma 4 (Safe Readout Ensures Safety on the Abstract Configuration).
For $\Pi$, $\mathbf{C}$ and $\mathcal{C}$ such that $\mathrm{safe}_\Pi(\mathbf{C} \mid \mathcal{C})$ holds, $\mathrm{safe}_\Pi(\mathcal{C})$ holds.

Proof. By straightforward induction over the judgment deduction. Note that safety on an extended abstract variable summary is in fact an extension of safety on an abstract variable summary. □
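The give/take discipline shared by both notions of summary safety can be sketched as a small check. This is a simplified illustration, not the formal definition: the names `Item` and `safe_summary` are hypothetical, the type-equivalence premise $T \sim_{\mathbf{A}} T'$ and the addresses of the extended summary are omitted, and the lifetime order is passed as an explicit set of pairs (reflexivity is built in; other closure properties are assumed of the input).

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical summary item: give_alpha(x) or take_alpha(x), with types omitted.
#[derive(Clone, Debug, PartialEq)]
enum Item {
    Give { var: String, lifetime: String },
    Take { var: String, lifetime: String },
}

// Checks that every abstract variable occurring in the summary has exactly one
// `give` and one `take`, and that the give lifetime is <= the take lifetime.
// `leq` holds the pairs (a, b) with a <= b in the lifetime context.
fn safe_summary(items: &[Item], leq: &BTreeSet<(String, String)>) -> bool {
    let mut per_var: BTreeMap<&str, (Vec<&str>, Vec<&str>)> = BTreeMap::new();
    for item in items {
        match item {
            Item::Give { var, lifetime } => {
                per_var.entry(var.as_str()).or_default().0.push(lifetime.as_str())
            }
            Item::Take { var, lifetime } => {
                per_var.entry(var.as_str()).or_default().1.push(lifetime.as_str())
            }
        }
    }
    per_var.values().all(|(gives, takes)| {
        gives.len() == 1
            && takes.len() == 1
            && (gives[0] == takes[0]
                || leq.contains(&(gives[0].to_string(), takes[0].to_string())))
    })
}
```

Variables that do not occur at all satisfy the condition vacuously, matching the empty-multiset rule.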
Bisimulation Lemma.
The safe readout defined above is actually a bisimulation between the concrete and abstract operational semantics.
Lemma 5 (Bisimulation between Concrete and Abstract Operational Semantics).
Take any $\Pi$, $\mathbf{C}$ and $\mathcal{C}$ satisfying $\mathrm{safe}_\Pi(\mathbf{C} \mid \mathcal{C})$. For any $\mathbf{C}'$ satisfying $\mathbf{C} \to_\Pi \mathbf{C}'$, there exists $\mathcal{C}'$ satisfying $\mathcal{C} \to_\Pi \mathcal{C}'$ and $\mathrm{safe}_\Pi(\mathbf{C}' \mid \mathcal{C}')$. Likewise, for any $\mathcal{C}'$ satisfying $\mathcal{C} \to_\Pi \mathcal{C}'$, there exists $\mathbf{C}'$ satisfying $\mathbf{C} \to_\Pi \mathbf{C}'$ and $\mathrm{safe}_\Pi(\mathbf{C}' \mid \mathcal{C}')$.

Proof. How to take $\mathcal{C}'$ according to $\mathbf{C}'$ and vice versa can be decided in a straightforward way that we do not explicitly describe here. The property $\mathrm{safe}_\Pi(\mathbf{C}' \mid \mathcal{C}')$ can be justified by the following observations.

No Unexpected Changes on Unrelated Data.
The safety on the extended memory footprint ensures that operations on hotly accessed data do not affect unrelated data. Here, the following property plays a role: when $\mathrm{readout}_{\mathbf{H},\mathrm{hot}}(a :: P\,T \mid \hat{v};\ \hat{X}, \hat{M})$ holds and $P$ is of form $\mathrm{own}$ or $\mathrm{mut}_\alpha$, $\{\!|\, \mathrm{hot}(a + k) \mid k \in [T] \,|\!\} \subseteq \hat{M}$ holds.

Preservation of the Safety on the Extended Abstract Variable Summary.
It can be shown in a similar way to the proof of Lemma 1.
Preservation of Safety on the Extended Memory Footprint.
It can be shown by straightforward case analysis.

One important point is that, on lifetime elimination ($\mathsf{now}\ \alpha$), a frozen hot access ($\mathrm{hot}_{\dagger\alpha}(a)$) can be safely made active ($\mathrm{hot}_{\mathrm{active}}(a)$), because there are no cold accesses on $a$, which is guaranteed by the type system.

Another point is that swap ($\mathsf{swap}(*x, *y)$) does not change the extended memory footprint. □

Property 3 (Safety on the Concrete Configuration is Preserved).
For any $\Pi$ and $\mathbf{C}$, $\mathbf{C}'$ such that $\mathrm{safe}_\Pi(\mathbf{C})$ and $\mathbf{C} \to_\Pi \mathbf{C}'$ hold, $\mathrm{safe}_\Pi(\mathbf{C}')$ is satisfied.

Proof.
It follows immediately from Lemma 5. □
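The two-way step-matching condition stated in Lemma 5 can be illustrated on finite transition systems. The sketch below does not reproduce the proof; it only checks, for small explicitly given systems, whether a relation between "concrete" and "abstract" states is a bisimulation in the sense used above. All names (`State`, `Edges`, `successors`, `is_bisimulation`) and the integer encoding of states are hypothetical.

```rust
use std::collections::HashSet;

// Hypothetical encoding: states are integers, a system is a set of edges.
type State = u32;
type Edges = HashSet<(State, State)>;

/// All states reachable from `s` in one step.
fn successors(edges: &Edges, s: State) -> Vec<State> {
    edges
        .iter()
        .filter(|(from, _)| *from == s)
        .map(|(_, to)| *to)
        .collect()
}

/// Returns true if every related pair (c, a) satisfies both directions:
/// each concrete step from c is matched by an abstract step from a that
/// leads to a related pair, and each abstract step is matched likewise.
fn is_bisimulation(
    concrete: &Edges,
    abstract_sys: &Edges,
    rel: &HashSet<(State, State)>,
) -> bool {
    rel.iter().all(|&(c, a)| {
        let c_next = successors(concrete, c);
        let a_next = successors(abstract_sys, a);
        let forth = c_next
            .iter()
            .all(|c2| a_next.iter().any(|a2| rel.contains(&(*c2, *a2))));
        let back = a_next
            .iter()
            .all(|a2| c_next.iter().any(|c2| rel.contains(&(*c2, *a2))));
        forth && back
    })
}
```

In this reading, the relation plays the role of the safe readout: a property that follows from membership in the relation, such as safety of the concrete side, is carried along every step, which is the shape of the argument for Property 3.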
C.6 Equivalence of the COS-based and AOS-based Models
After introducing some easy lemmas, we prove the equivalence of the COS-based and AOS-based models (Theorem 3), relying on the bisimulation lemma (Lemma 5) proved above. Finally, we complete the proof of Theorem 1.
Lemma 6.
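Take any $\Pi$, simple $f$ and $L$. For any $\mathbf{F}$, $\mathbf{H}$ and $\mathcal{F}$, the following equivalence holds, if $L = \mathrm{entry}$ or the statement at $L$ is of form $\mathsf{return}\ x$.

\[
\mathrm{safe}_{\mathbf{H}}(\mathbf{F} :: \Gamma_{\Pi,f,L} \mid \mathcal{F}) \iff \mathrm{safe}_\Pi\Big([f, L]\,\mathbf{F} \mid \mathbf{H} \ \Big|\ [f, L]_{\emptyset}\,\mathcal{F} \mid_{(\emptyset,\emptyset)}\Big)
\]

(The $\mathrm{safe}_{\mathbf{H}}$ judgment is defined in §…)

Proof. By straightforward induction. □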
Lemma 7.
For any $\Pi$ and $\mathbf{C}$ of form $[f, L]\,\mathbf{F} \mid \mathbf{H}$, when $f$ is simple, there is at most one $\mathcal{C}$ satisfying $\mathrm{safe}_\Pi(\mathbf{C} \mid \mathcal{C})$.

Proof. By straightforward induction. The simpleness of $f$ makes the situation easy, because abstract variables do not occur in $\mathcal{C}$. □

Lemma 8.
For any $\Pi$ and $\mathcal{C}$ of form $[f, L]_{\emptyset}\,\mathcal{F} \mid_{(\emptyset,\emptyset)}$, when $f$ is simple and $\mathcal{C}$ is safe, there exists $\mathbf{C}$ satisfying $\mathrm{safe}_\Pi(\mathbf{C} \mid \mathcal{C})$.

Proof. By straightforward construction. □
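Theorem 3 (Equivalence of the COS-based Model and the AOS-based Model).
For any $\Pi$ and simple $f$, $f^{\mathrm{COS}}_\Pi$ is equivalent to $f^{\mathrm{AOS}}_\Pi$.

Proof. Let us show that
\[
f^{\mathrm{COS}}_\Pi(v_0, \ldots, v_{n-1}, w) \iff f^{\mathrm{AOS}}_\Pi(v_0, \ldots, v_{n-1}, w)
\]
holds for any values $v_0, \ldots, v_{n-1}, w$ of the sorts $(\!|T_0|\!), \ldots, (\!|T_{n-1}|\!), (\!|U|\!)$, where $\Sigma_{\Pi,f} = (x_0{:}\,T_0, \ldots, x_{n-1}{:}\,T_{n-1}) \to U$.

($\Longrightarrow$). By assumption, we can take concrete configurations $\mathbf{C}_0, \ldots, \mathbf{C}_N$ satisfying the following (for some $L$, $y$, $\mathbf{F}$, $\mathbf{H}$, $\mathbf{F}'$ and $\mathbf{H}'$).
\[
\mathbf{C}_0 \to_\Pi \cdots \to_\Pi \mathbf{C}_N \qquad \mathrm{final}_\Pi(\mathbf{C}_N) \qquad \mathbf{C}_0 = [f, \mathrm{entry}]\,\mathbf{F} \mid \mathbf{H} \qquad \mathbf{C}_N = [f, L]\,\mathbf{F}' \mid \mathbf{H}'
\]
\[
\mathrm{safe}_{\mathbf{H}}\big(\mathbf{F} :: \Gamma_{\Pi,f,\mathrm{entry}} \bigm| \{(x_i, v_i) \mid i \in [n]\}\big) \qquad \mathrm{safe}_{\mathbf{H}'}\big(\mathbf{F}' :: \Gamma_{\Pi,f,L} \bigm| \{(y, w)\}\big)
\]
By Lemma 6, taking abstract configurations
\[
\mathcal{C}_0 := [f, \mathrm{entry}]_{\emptyset}\,\{(x_i, v_i) \mid i \in [n]\} \mid_{(\emptyset,\emptyset)} \qquad \mathcal{C}'_N := [f, L]_{\emptyset}\,\{(y, w)\} \mid_{(\emptyset,\emptyset)},
\]
we have $\mathrm{safe}_\Pi(\mathbf{C}_0 \mid \mathcal{C}_0)$ and $\mathrm{safe}_\Pi(\mathbf{C}_N \mid \mathcal{C}'_N)$. By Lemma 4, $\mathrm{safe}_\Pi(\mathcal{C}_0)$ also holds. By Lemma 5, we can take $\mathcal{C}_1, \ldots, \mathcal{C}_N$ satisfying $\mathcal{C}_0 \to_\Pi \cdots \to_\Pi \mathcal{C}_N$, $\mathrm{final}_\Pi(\mathcal{C}_N)$, and $\mathrm{safe}_\Pi(\mathbf{C}_{k+1} \mid \mathcal{C}_{k+1})$ (for any $k \in [N]$). Since $\mathrm{safe}_\Pi(\mathbf{C}_N \mid \mathcal{C}_N)$ and $\mathrm{safe}_\Pi(\mathbf{C}_N \mid \mathcal{C}'_N)$ hold, by Lemma 7 we have $\mathcal{C}_N = \mathcal{C}'_N$. Therefore, $f^{\mathrm{AOS}}_\Pi(v_0, \ldots, v_{n-1}, w)$ holds.

($\Longleftarrow$). By assumption, we can take abstract configurations $\mathcal{C}_0, \ldots, \mathcal{C}_N$ satisfying the following (for some $L$ and $y$).
\[
\mathcal{C}_0 \to_\Pi \cdots \to_\Pi \mathcal{C}_N \qquad \mathrm{final}_\Pi(\mathcal{C}_N)
\]
\[
\mathcal{C}_0 = [f, \mathrm{entry}]_{\emptyset}\,\{(x_i, v_i) \mid i \in [n]\} \mid_{(\emptyset,\emptyset)} \qquad \mathcal{C}_N = [f, L]_{\emptyset}\,\{(y, w)\} \mid_{(\emptyset,\emptyset)}
\]
By Lemma 8, there exists $\mathbf{C}_0$ such that $\mathrm{safe}_\Pi(\mathbf{C}_0 \mid \mathcal{C}_0)$ holds. By Lemma 5, we can take $\mathbf{C}_1, \ldots, \mathbf{C}_N$ satisfying $\mathbf{C}_0 \to_\Pi \cdots \to_\Pi \mathbf{C}_N$, $\mathrm{final}_\Pi(\mathbf{C}_N)$, and $\mathrm{safe}_\Pi(\mathbf{C}_{k+1} \mid \mathcal{C}_{k+1})$ (for any $k \in [N]$). $\mathbf{C}_0$ and $\mathbf{C}_N$ have form
\[
\mathbf{C}_0 = [f, \mathrm{entry}]\,\mathbf{F} \mid \mathbf{H} \qquad \mathbf{C}_N = [f, L]\,\mathbf{F}' \mid \mathbf{H}',
\]
and by Lemma 6 the following judgments hold.
\[
\mathrm{safe}_{\mathbf{H}}\big(\mathbf{F} :: \Gamma_{\Pi,f,\mathrm{entry}} \bigm| \{(x_i, v_i) \mid i \in [n]\}\big) \qquad \mathrm{safe}_{\mathbf{H}'}\big(\mathbf{F}' :: \Gamma_{\Pi,f,L} \bigm| \{(y, w)\}\big)
\]
Therefore, $f^{\mathrm{COS}}_\Pi(v_0, \ldots, v_{n-1}, w)$ holds. □□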