Memory-Efficient Fixpoint Computation
Sung Kook Kim¹, Arnaud J. Venet², and Aditya V. Thakur¹

¹ University of California, Davis CA 95616, USA
{sklkim,avthakur}@ucdavis.edu
² Facebook, Inc., Menlo Park CA 94025, USA
[email protected]
Abstract.
Practical adoption of static analysis often requires trading precision for performance. This paper focuses on improving the memory efficiency of abstract interpretation without sacrificing precision or time efficiency. Computationally, abstract interpretation reduces the problem of inferring program invariants to computing a fixpoint of a set of equations. This paper presents a method to minimize the memory footprint in Bourdoncle's iteration strategy, a widely-used technique for fixpoint computation. Our technique is agnostic to the abstract domain used. We prove that our technique is optimal (i.e., it results in minimum memory footprint) for Bourdoncle's iteration strategy while computing the same result. We evaluate the efficacy of our technique by implementing it in a tool called
Mikos, which extends the state-of-the-art abstract interpreter IKOS. When verifying user-provided assertions,
Mikos shows a decrease in peak-memory usage to . % ( . ×) on average compared to IKOS. When performing interprocedural buffer-overflow analysis, Mikos shows a decrease in peak-memory usage to . % ( . ×) on average compared to IKOS.

1 Introduction

Abstract interpretation [15] is a general framework for expressing static analysis of programs. Program invariants inferred by an abstract interpreter are used in client applications such as program verifiers, program optimizers, and bug finders. To extract the invariants, an abstract interpreter computes a fixpoint of an equation system approximating the program semantics. The efficiency and precision of the abstract interpreter depend on the iteration strategy, which specifies the order in which the equations are applied during fixpoint computation. The recursive iteration strategy developed by Bourdoncle [10] is widely used for fixpoint computation in academic and industrial abstract interpreters such as NASA IKOS [11], Crab [32], Facebook SPARTA [17], Kestrel Technology CodeHawk [48], and Facebook Infer [12]. Extensions to Bourdoncle's approach that improve precision [1] and time efficiency [27] have also been proposed.

This paper focuses on improving the memory efficiency of abstract interpretation. This is an important problem in practice because large memory requirements can prevent clients such as compilers and developer tools from using
Fig. 1: Control-flow graph G

sophisticated analyses. This has motivated approaches for efficient implementations of abstract domains [26,4,44], including techniques that trade precision for efficiency [18,5,25].

This paper presents a technique for memory-efficient fixpoint computation. Our technique minimizes the memory footprint in Bourdoncle's recursive iteration strategy. Our approach is agnostic to the abstract domain and does not sacrifice time efficiency. We prove that our technique exhibits optimal peak-memory usage for the recursive iteration strategy while computing the same fixpoint (§ 3). Specifically, our approach does not change the iteration order but provides a mechanism for early deallocation of abstract values. Thus, there is no loss of precision when improving memory performance. Furthermore, such "backward compatibility" ensures that existing implementations of Bourdoncle's approach can be replaced without impacting clients of the abstract interpreter, an important requirement in practice.

Suppose we are tasked with proving assertions at program points 4 and 9 of the control-flow graph G(V, →) in Figure 1. Current approaches (§ 2.1) allocate abstract values for each program point during fixpoint computation, check the assertions at 4 and 9 after fixpoint computation, and then deallocate all abstract values. In contrast, our approach deallocates abstract values and checks the assertions during fixpoint computation while guaranteeing that the results of the checks remain the same and that the peak-memory usage is optimal.

We prove that our approach deallocates abstract values as soon as they are no longer needed during fixpoint computation. Providing this theoretical guarantee is challenging for arbitrary irreducible graphs such as G. For example, assuming that node 8 is analyzed after 2, one might think that the fixpoint iterator can deallocate the abstract value at 2 once it analyzes 8.
However, 8 is part of the strongly-connected component {7, 8}, and the fixpoint iterator might need to iterate over node 8 multiple times. Thus, deallocating the abstract value at 2 when node 8 is first analyzed will lead to incorrect results. In this case, the earliest that the abstract value at 2 can be deallocated is after the stabilization of component {7, 8}.

Furthermore, we prove that our approach performs the assertion checks as early as possible during fixpoint computation. Once the assertions are checked, the associated abstract values are deallocated. For example, consider the assertion check at node 4. Notice that 4 is part of the strongly-connected components {4, 5} and {3, 4, 5, 6}. Checking the assertion the first time node 4 is analyzed could lead to an incorrect result because the abstract value at 4 has not converged. The earliest that the check at node 4 can be executed is after the convergence of the component {3, 4, 5, 6}. Apart from being able to deallocate abstract values earlier, early assertion checks provide partial results on timeout.

The key theoretical result (Theorem 1) is that our iteration strategy is memory-optimal (i.e., it results in minimum memory footprint) while computing the same result as Bourdoncle's approach. Furthermore, we present an almost-linear time algorithm to compute this optimal iteration strategy (§ 4).

We have implemented this memory-optimal fixpoint computation in a tool called
Mikos (§ 5), which extends the state-of-the-art abstract interpreter for C/C++, IKOS [11]. We compared the memory efficiency of
Mikos and IKOS on the following tasks:

T1 Verifying user-provided assertions. Task T1 represents the program-verification client of a fixpoint computation. We performed interprocedural analysis of 784 SV-COMP 2019 benchmarks [6] using the reduced product of the Difference Bound Matrix domain with variable packing [18] and the congruence domain [21].

T2 Proving absence of buffer overflows. Task T2 represents the bug-finding and compiler-optimization client of fixpoint computation. In the context of bug finding, a potential buffer overflow can be reported to the user as a potential bug. In the context of compiler optimization, code to check buffer-access safety can be elided if the buffer access is verified to be safe. We performed interprocedural buffer-overflow analysis of 426 open-source programs using the interval abstract domain.

On Task T1,
Mikos shows a decrease in peak-memory usage to . % ( . ×) on average compared to IKOS. For instance, the peak memory required to analyze the SV-COMP 2019 benchmark ldv-3.16-rc1/205_9a-net-rtl8187 decreased from 46 GB to 56 MB. Also, while ldv-3.14/usb-mxl111sf ran out of memory in IKOS under a 64 GB memory limit, peak-memory usage was 21 GB for Mikos. On Task T2,
Mikos shows a decrease in peak-memory usage to . % ( . ×) on average compared to IKOS. For instance, the peak memory required to analyze the benchmark ssh-keygen decreased from 30 GB to 1 GB.

The contributions of the paper are as follows:
– A memory-optimal technique for Bourdoncle's recursive iteration strategy that does not sacrifice precision or time efficiency (§ 3).
– An almost-linear time algorithm to construct our memory-efficient iteration strategy (§ 4).
– Mikos, an interprocedural implementation of our approach (§ 5).
– An empirical evaluation of the efficacy of
Mikos using a large set of C benchmarks (§ 6).

§ 2 presents necessary background on fixpoint computation, including Bourdoncle's approach; § 7 presents related work; § 8 concludes.

2 Fixpoint Computation
This section presents background on fixpoint computation that will allow us to clearly state the problem addressed in this paper (§ 2.3). This section is not meant
to capture all possible approaches to implementing abstract interpretation. However, it does capture the relevant high-level structure of abstract-interpretation implementations such as IKOS [11].

Consider an equation system Φ whose dependency graph is G(V, →). The graph G typically reflects the control-flow graph of the program, though this is not always true. The aim is to find the fixpoint of the equation system Φ:

  Pre[v] = ⊔ { Post[p] | p → v }    v ∈ V        (1)
  Post[v] = τ_v(Pre[v])             v ∈ V

The maps
Pre : V → A and Post : V → A maintain the abstract values at the beginning and end of each program point, where A is an abstract domain. The abstract transformer τ_v : A → A overapproximates the semantics of program point v ∈ V. After fixpoint computation, Pre[v] is an invariant for v ∈ V.

Client applications of the abstract interpreter typically query these fixpoint values to perform assertion checks, program optimizations, or report bugs. Let V_C ⊆ V be the set of program points where such checks are performed, and let ϕ_v : A → bool represent the corresponding functions that perform the check for each v ∈ V_C. To simplify presentation, we assume that the check function merely returns true or false. Thus, after fixpoint computation, the client application computes ϕ_v(Pre[v]) for each v ∈ V_C.

The exact least solution of the system Eq. 1 can be computed using Kleene iteration provided A is Noetherian. However, most interesting abstract domains require the use of widening (∇) to ensure termination, followed by narrowing to improve the post solution. In this paper, we use "fixpoint" to refer to such an approximation of the least fixpoint. Furthermore, for simplicity of presentation, we restrict our description to a simple widening strategy. However, our implementation (§ 5) uses more sophisticated widening and narrowing strategies implemented in state-of-the-art abstract interpreters [11,1].

An iteration strategy specifies the order in which the individual equations are applied, where widening is used, and how convergence of the equation system is checked. For clarity of exposition, we introduce a Fixpoint Machine (FM) consisting of an imperative set of instructions. An FM program represents a particular iteration strategy used for fixpoint computation. The syntax of Fixpoint Machine programs is defined by the following grammar:

  Prog ::= exec v | repeat v [ Prog ] | Prog #
Prog,    v ∈ V        (2)

Informally, the instruction exec v applies τ_v for v ∈ V; the instruction repeat v [P] repeatedly executes the FM program P until convergence and performs widening at v; and the instruction P1 # P2 executes FM programs P1 and P2 in sequence. The syntax (Eq. 2) and semantics (Figure 2) of the Fixpoint Machine are sufficient to express Bourdoncle's recursive iteration strategy (§ 2.1), a widely-used approach for fixpoint computation [10]. We also extend the notion of iteration strategy to perform memory management of the abstract values as well as perform checks during fixpoint computation (§ 2.2).

2.1 Bourdoncle's Recursive Iteration Strategy

In this section, we review Bourdoncle's recursive iteration strategy [10] and show how to generate the corresponding FM program. Bourdoncle's iteration strategy relies on the notion of a weak topological ordering (WTO) of a directed graph G(V, →). A WTO is defined using the notion of a hierarchical total ordering (HTO) of a set.

Definition 1. A hierarchical total ordering H of a set S is a well-parenthesized permutation of S without two consecutive "(". □

An HTO H is a string over the alphabet S augmented with left and right parentheses. Alternatively, we can denote an HTO H by the tuple (S, ⪯, ω), where ⪯ is the total order induced by H over the elements of S and ω : V → 2^V. The elements between two matching parentheses are called a component, and the first element of a component is called the head. Given l ∈ S, ω(l) is the set of heads of the components containing l. We use C : V → 2^V to denote the mapping from a head to its component.

Example 1.
Let V = {1, 2, 3, 4, 5, 6, 7, 8, 9}. An example HTO H(V, ⪯, ω) is 1 2 (3 (4 5) 6) (7 8) 9. ω(3) = {3}, ω(5) = {3, 4}, and ω(1) = ∅. It has components C(4) = {4, 5}, C(7) = {7, 8}, and C(3) = {3, 6} ∪ C(4). □

A weak topological ordering (WTO) W of a directed graph G(V, →) is an HTO H(V, ⪯, ω) satisfying certain constraints listed below:

Definition 2. A weak topological ordering W(V, ⪯, ω) of a directed graph G(V, →) is an HTO H(V, ⪯, ω) such that for every edge u → v, either (i) u ≺ v, or (ii) v ⪯ u and v ∈ ω(u). □

Example 2.
HTO H in Example 1 is a WTO W of the graph G (Figure 1). □

Given a directed graph G(V, →) that represents the dependency graph of the equation system, Bourdoncle's approach uses a WTO W(V, ⪯, ω) of G to derive the following recursive iteration strategy:
– The total order ⪯ determines the order in which the equations are applied. The equation after a component is applied only after the component stabilizes.
– The stabilization of a component C(h) is determined by checking the stabilization of the head h.
– Widening is performed at each of the heads.

We now show how the WTO can be represented using the syntax of our Fixpoint Machine (FM) defined in Eq. 2. The following function genProg : WTO → Prog maps a given WTO W to an FM program:

  genProg(W) := repeat v [ genProg(W′) ]      if W = (v W′)
                genProg(W1) # genProg(W2)     if W = W1 W2        (3)
                exec v                        if W = v
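The genProg rules translate almost verbatim into code. Below is a minimal Python sketch, assuming a nested-list encoding of the WTO (a sublist is a component with its head as the first element) and '#' as plain-text sequencing; both encodings are illustrative conventions, not the paper's data structures.

```python
def gen_prog(wto):
    """Transcription of genProg (Eq. 3): plain elements of the list are
    nodes (case W = v), sublists are components whose head is the first
    element (case W = (v W')), and adjacent items are sequenced with '#'
    (case W = W1 W2). Returns the FM program as a string."""
    parts = []
    for item in wto:
        if isinstance(item, list):
            head, rest = item[0], item[1:]
            parts.append(f"repeat {head} [ {gen_prog(rest)} ]")  # W = (v W')
        else:
            parts.append(f"exec {item}")                         # W = v
    return " # ".join(parts)                                     # W = W1 W2

# The WTO of Example 1: 1 2 (3 (4 5) 6) (7 8) 9.
wto = [1, 2, [3, [4, 5], 6], [7, 8], 9]
print(gen_prog(wto))
```

On the WTO of Example 1 this prints the FM program of Example 3: exec 1 # exec 2 # repeat 3 [ repeat 4 [ exec 5 ] # exec 6 ] # repeat 7 [ exec 8 ] # exec 9.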
Each node v ∈ V is mapped to a single FM instruction by genProg; we use Inst[v] to refer to the FM instruction corresponding to v. Note that if v ∈ V is a head, then Inst[v] is an instruction of the form repeat v [...]; otherwise, Inst[v] is exec v.

Example 3.
The WTO W of graph G (Figure 1) is 1 2 (3 (4 5) 6) (7 8) 9. The corresponding FM program is P = genProg(W) = exec 1 # exec 2 # repeat 3 [ repeat 4 [ exec 5 ] # exec 6 ] # repeat 7 [ exec 8 ] # exec 9. The colors used for brackets and parentheses are to more clearly indicate the correspondence between the WTO and the FM program. Note that Inst[1] = exec 1, and Inst[4] = repeat 4 [ exec 5 ]. □

Ignoring the text in gray, the semantics of the FM instructions shown in Figure 2 capture Bourdoncle's recursive iteration strategy. The semantics are parameterized by the graph G(V, →) and a WTO W(V, ⪯, ω).

2.2 Memory Management and Checks During Fixpoint Computation

In this paper, we extend the notion of iteration strategy to indicate when abstract values are deallocated and when checks are executed. The gray text in Figure 2 shows the semantics of the FM instructions that handle these issues. The right-hand side of ⇒ is executed if the left-hand side evaluates to true. Recall that the set V_C ⊆ V is the set of program points that have assertion checks. The map Ck : V_C → bool records the result of executing the check ϕ_u(Pre[u]) for each u ∈ V_C. Thus, the output of the FM program is the map Ck. In practice, the functions ϕ_u are expensive to compute. Furthermore, they often write the result to a database or report the output to a user. Consequently, we assume that only the first execution of ϕ_u is recorded in Ck.

The memory configuration M is a tuple (Dpost, Achk, Dpostℓ, Dpreℓ) where
– The map
Dpost : V → V controls the deallocation of values in Post that have no further use. If v = Dpost[u], Post[u] is deallocated after the execution of Inst[v].
– The map
Achk : V_C → V controls when the check function ϕ_u corresponding to u ∈ V_C is executed, after which the corresponding Pre value is deallocated. If
Achk[u] = v, assertions in u are checked and Pre[u] is subsequently deallocated after the execution of Inst[v].
– The map
Dpostℓ : V → 2^V controls deallocation of Post values that are recomputed and overwritten in the loop of a repeat instruction before their next use. If v ∈ Dpostℓ[u], Post[u] is deallocated in the loop of Inst[v].
– The map
Dpreℓ : V_C → 2^V controls deallocation of Pre values that are recomputed and overwritten in the loop of a repeat instruction before their next use. If v ∈ Dpreℓ[u], Pre[u] is deallocated in the loop of Inst[v].

To simplify presentation, the semantics in Figure 2 does not make explicit the allocations of abstract values: if a Post or Pre value that has been deallocated is accessed, then it is allocated and initialized to ⊥.

Given G(V, →), WTO W(V, ⪯, ω), V_C ⊆ V, and memory configuration M(Dpost, Achk, Dpostℓ, Dpreℓ):

⟦exec v⟧M ≝
  Pre[v] ← ⊔ { Post[p] | p → v }
  foreach u ∈ V : v = Dpost[u] ⇒ free Post[u]
  Post[v] ← τ_v(Pre[v])
  v ∉ V_C ⇒ free Pre[v]
  foreach u ∈ V_C : v = Achk[u] ⇒ Ck[u] ← ϕ_u(Pre[u]); free Pre[u]

⟦repeat v [P]⟧M ≝
  tpre ← ⊔ { Post[p] | p → v ∧ v ∉ ω(p) }              (Preamble)
  do {                                                  (Loop)
    foreach u ∈ V : v ∈ Dpostℓ[u] ⇒ free Post[u]
    foreach u ∈ V_C : v ∈ Dpreℓ[u] ⇒ free Pre[u]
    Pre[v], Post[v] ← tpre, τ_v(tpre)
    ⟦P⟧M
    tpre ← Pre[v] ∇ ⊔ { Post[p] | p → v }
  } while (tpre ⋢ Pre[v])
  foreach u ∈ V : v = Dpost[u] ⇒ free Post[u]           (Postamble)
  v ∉ V_C ⇒ free Pre[v]
  foreach u ∈ V_C : v = Achk[u] ⇒ Ck[u] ← ϕ_u(Pre[u]); free Pre[u]

⟦P1 # P2⟧M ≝ ⟦P1⟧M ; ⟦P2⟧M

Fig. 2: The semantics of the Fixpoint Machine (FM) instructions of Eq. 2.

Two memory configurations are equivalent if they result in the same values for each check in the program:
Definition 3.
Given an FM program P, memory configuration M1 is equivalent to M2, denoted by ⟦P⟧M1 = ⟦P⟧M2, iff for all u ∈ V_C, we have Ck1[u] = Ck2[u], where Ck1 and Ck2 are the check maps corresponding to executions of P using M1 and M2, respectively. □

The default memory configuration M_dflt performs checks and deallocations at the end of the FM program after the fixpoint has been computed.

Definition 4.
Given an FM program P, the default memory configuration M_dflt(Dpost_dflt, Achk_dflt, Dpostℓ_dflt, Dpreℓ_dflt) is Dpost_dflt[v] = z for all v ∈ V,
Achk_dflt[c] = z for all c ∈ V_C, and Dpostℓ_dflt = Dpreℓ_dflt = ∅, where z is the last instruction in P. □

Example 4.
Consider the FM program P from Example 3. Let V_C = {4, 9}. Dpost_dflt[v] = 9 for all v ∈ V. That is, all Post values are deallocated at the end of the fixpoint computation. Also,
Achk_dflt[4] = Achk_dflt[9] = 9, meaning that the assertion checks also happen at the end.
Dpostℓ_dflt = Dpreℓ_dflt = ∅, so the FM program does not clear abstract values that will be recomputed and overwritten in a loop of a repeat instruction. □

2.3 Problem Statement

Given an FM program P, a memory configuration M is valid for P iff it is equivalent to the default configuration; i.e., ⟦P⟧M = ⟦P⟧M_dflt. Furthermore, a valid memory configuration M is optimal for a given FM program P iff the memory footprint of ⟦P⟧M is smaller than or equal to that of ⟦P⟧M′ for all valid memory configurations M′. The problem addressed in this paper can be stated as:

Given an FM program P, find an optimal memory configuration M.

An optimal configuration should deallocate abstract values during fixpoint computation as soon as they are no longer needed. The challenge is ensuring that the memory configuration remains valid even without knowing the number of loop iterations for repeat instructions. § 3 gives the optimal memory configuration for the FM program P from Example 3.

3 Optimal Memory Configuration M_opt

This section provides a declarative specification of an optimal memory configuration M_opt(Dpost_opt, Achk_opt, Dpostℓ_opt, Dpreℓ_opt). The proofs of the theorems in this section can be found in Appendix A. § 4 presents an efficient algorithm for computing M_opt.

Definition 5.
Given a WTO W(V, ⪯, ω) of a graph G(V, →), the nesting relation N is a tuple (V, ⪯N) where x ⪯N y iff x = y or y ∈ ω(x), for x, y ∈ V. □

Let ⌈v⌉ ≝ {w ∈ V | v ⪯N w}; that is, ⌈v⌉ equals the set containing v and the heads of components in the WTO that contain v. The nesting relation N(V, ⪯N) is a forest; i.e., a partial order such that for all v ∈ V, (⌈v⌉, ⪯N) is a chain (Theorem 4, Appendix A.1).

Example 5.
For the WTO W of G in Example 2, N(V, ⪯N) is the forest with roots 1, 2, 3, 7, and 9, in which 4 and 6 are children of 3, 5 is a child of 4, and 8 is a child of 7. Note that ⌈5⌉ = {5, 4, 3}, forming a chain 5 ⪯N 4 ⪯N 3. □

Dpost_opt
Dpost_opt[u] = v implies that v is the earliest instruction at which Post[u] can be deallocated while ensuring that there are no subsequent reads of Post[u] during fixpoint computation. We cannot conclude Dpost_opt[u] = v from a dependency u → v, as illustrated in the following example.

Example 6.
Consider the FM program P from Example 3, whose graph G(V, →) is in Figure 1. Although 2 → 8, a memory configuration with Dpost[2] = 8 is not valid:
Post[2] is read by Inst[8], which is executed repeatedly as part of Inst[7]; if Dpost[2] = 8, Post[2] is deallocated the first time Inst[8] is executed, and subsequent executions of Inst[8] will read ⊥ as the value of Post[2]. □

In general, for a dependency u → v, we must find the head of the maximal component that contains v but not u as the candidate for Dpost_opt[u]. By choosing the head of the maximal component, we remove the possibility of having a larger component whose head's repeat instruction can execute Inst[v] after deallocating Post[u]. If there is no component that contains v but not u, we simply use v as the candidate. The following Lift operator gives us the candidate of
Dpost_opt[u] for u → v:

  Lift(u, v) ≝ max⪯N ((⌈v⌉ \ ⌈u⌉) ∪ {v})        (4)

⌈v⌉ gives us v and the heads of components that contain v. Subtracting ⌈u⌉ removes the heads of components that also contain u. We put back v to account for the case when there is no component containing v but not u and ⌈v⌉ \ ⌈u⌉ is empty. Because N(V, ⪯N) is a forest, ⌈v⌉ and ⌈u⌉ are chains, and hence, ⌈v⌉ \ ⌈u⌉ is also a chain. Therefore, the maximum is well-defined.

Example 7.
Consider the nesting relation N(V, ⪯N) from Example 5. Lift(2, 8) = max⪯N(({8, 7} \ {2}) ∪ {8}) = 7. We see that 7 is the head of the maximal component containing 8 but not 2. Also, Lift(5, 4) = max⪯N(({4, 3} \ {5, 4, 3}) ∪ {4}) = 4. There is no component that contains 4 but not 5. □

For each instruction u, we now need to find the last instruction from among the candidates computed using Lift. Notice that deallocations of
Post values are at the postamble of repeat instructions in Figure 2. Therefore, we cannot use the total order ⪯ of a WTO to find the last instruction: ⪯ is the order in which the instructions begin executing, or the order in which preambles are executed.
Example 8.
Let Dpost_to[u] ≝ max⪯ {Lift(u, v) | u → v}, u ∈ V, be an incorrect variant of Dpost_opt that uses the total order ⪯. Consider the FM program P from Example 3, whose graph G(V, →) is in Figure 1 and nesting relation N(V, ⪯N) is in Example 5. Post[5] has dependencies 5 → 3 and 5 → 4. Lift(5, 4) = 4, Lift(5, 3) = 3. Now, Dpost_to[5] = 4 because 3 ⪯ 4. However, a memory configuration with Dpost[5] = 4 is not valid: Inst[4] is nested in Inst[3]. Due to the deletion of Post[5] in Inst[4], Inst[3] will read ⊥ as the value of Post[5]. □

To find the order in which the instructions finish executing, or the order in which postambles are executed, we define the relation (V, ≤) using the total order (V, ⪯) and the nesting relation (V, ⪯N):

  x ≤ y ≝ x ⪯N y ∨ (y ⋠N x ∧ x ⪯ y)        (5)

In the definition of ≤, the nesting relation ⪯N takes precedence over ⪯. (V, ≤) is a total order (Theorem 5, Appendix A.1). Intuitively, the total order ≤ moves the heads in the WTO to their corresponding closing parentheses ')'.

Example 9.
For G (Figure 1) and its WTO W, 1 2 (3 (4 5) 6) (7 8) 9, we have 1 ≤ 2 ≤ 5 ≤ 4 ≤ 6 ≤ 3 ≤ 8 ≤ 7 ≤ 9. Note that 3 ⪯ 6 while 6 ≤ 3. The postamble of repeat 3 [...] is executed after Inst[6], while the preamble of repeat 3 [...] is executed before Inst[6]. □
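Eq. 5 can be evaluated directly from two ingredients that are easy to read off a WTO: the textual position of each element (the total order ⪯) and the chain of enclosing heads ⌈v⌉ (for ⪯N). A sketch, assuming an illustrative nested-list encoding of the WTO (a sublist is a component with its head first):

```python
from functools import cmp_to_key

# WTO of Example 1, encoded as nested lists (sublist = component, head first).
WTO = [1, 2, [3, [4, 5], 6], [7, 8], 9]

def up_chains(wto, outer=()):
    """Map each v to the tuple (v, then enclosing heads innermost-first);
    as a set this is the chain written ⌈v⌉ in the text."""
    ch = {}
    for item in wto:
        if isinstance(item, list):
            ch.update(up_chains(item[1:], outer + (item[0],)))
            ch[item[0]] = (item[0],) + tuple(reversed(outer))
        else:
            ch[item] = (item,) + tuple(reversed(outer))
    return ch

def flatten(wto):
    return [x for item in wto
            for x in (flatten(item) if isinstance(item, list) else [item])]

UP = up_chains(WTO)
POS = {v: i for i, v in enumerate(flatten(WTO))}  # the total order ⪯

def le(x, y):
    """Eq. 5: x ≤ y iff x ⪯N y, or (not y ⪯N x and x ⪯ y)."""
    if y in UP[x]:      # x ⪯N y
        return True
    if x in UP[y]:      # y ⪯N x strictly
        return False
    return POS[x] <= POS[y]

order = sorted(POS, key=cmp_to_key(lambda a, b: -1 if le(a, b) else 1))
print(order)  # the postamble order of Example 9: 1 2 5 4 6 3 8 7 9
```

Sorting V by this comparator reproduces the sequence 1 ≤ 2 ≤ 5 ≤ 4 ≤ 6 ≤ 3 ≤ 8 ≤ 7 ≤ 9 of Example 9, with each head moved to its closing parenthesis.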
Dpost_opt. Given a nesting relation N(V, ⪯N) for the graph G(V, →), Dpost_opt is defined as:

  Dpost_opt[u] ≝ max≤ {Lift(u, v) | u → v},    u ∈ V        (6)

Example 10.
Consider the FM program P from Example 3, whose graph G(V, →) is in Figure 1 and nesting relation N(V, ⪯N) is in Example 5. An optimal memory configuration M_opt defined by Eq. 6 is: Dpost_opt[1] = 2, Dpost_opt[2] = Dpost_opt[3] = Dpost_opt[8] = 7, Dpost_opt[4] = 6, Dpost_opt[5] = Dpost_opt[6] = 3, Dpost_opt[7] = Dpost_opt[9] = 9.

Successors of u are first lifted to compute Dpost_opt[u]. For example, to compute Dpost_opt[2], 2's successors, 3 and 8, are lifted to Lift(2, 3) = 3 and Lift(2, 8) = 7. To compute Dpost_opt[5], 5's successors, 3 and 4, are lifted to Lift(5, 3) = 3 and Lift(5, 4) = 4. Then, the maximum (as per the total order ≤) of the lifted successors is chosen as Dpost_opt[u]. Because 3 ≤ 7, Dpost_opt[2] = 7. Thus, Post[2] is deleted in Inst[7]. Also, because 4 ≤ 3, Dpost_opt[5] = 3, and Post[5] is deleted in Inst[3]. □
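Dpost_opt composes Eq. 4 and Eq. 6 mechanically: lift each successor of u along the nesting chains, then take the maximum under ≤. A self-contained sketch using an illustrative nested-list WTO encoding; only the two successor sets spelled out in Example 10 are exercised:

```python
# WTO of Example 1; sublist = component, head first (illustrative encoding).
WTO = [1, 2, [3, [4, 5], 6], [7, 8], 9]

def up_chains(wto, outer=()):
    """⌈v⌉ as a tuple: v, then enclosing heads, in ⪯N-ascending order."""
    ch = {}
    for item in wto:
        if isinstance(item, list):
            ch.update(up_chains(item[1:], outer + (item[0],)))
            ch[item[0]] = (item[0],) + tuple(reversed(outer))
        else:
            ch[item] = (item,) + tuple(reversed(outer))
    return ch

def flatten(wto):
    return [x for it in wto
            for x in (flatten(it) if isinstance(it, list) else [it])]

UP = up_chains(WTO)
POS = {v: i for i, v in enumerate(flatten(WTO))}  # the total order ⪯

def lift(u, v):
    """Eq. 4: head of the maximal component containing v but not u, else v.
    UP[v] is ⪯N-ascending, so the last surviving element is the maximum."""
    cand = [w for w in UP[v] if w not in UP[u]]
    return cand[-1] if cand else v

def le(x, y):
    """Eq. 5: the postamble order."""
    return y in UP[x] or (x not in UP[y] and POS[x] <= POS[y])

def dpost_opt(u, succs):
    """Eq. 6: maximum, under ≤, of the lifted successors of u."""
    best = None
    for v in succs:
        c = lift(u, v)
        if best is None or le(best, c):
            best = c
    return best

# Successor sets given in Example 10: node 2 has successors {3, 8},
# node 5 has successors {3, 4}.
print(dpost_opt(2, [3, 8]), dpost_opt(5, [3, 4]))  # 7 3
```

This reproduces Dpost_opt[2] = 7 and Dpost_opt[5] = 3 from Example 10.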
Achk_opt

Achk_opt[u] = v implies that v is the earliest instruction at which the assertion check at u ∈ V_C can be executed so that the invariant passed to the assertion-check function ϕ_u is the same as when using M_dflt, thus guaranteeing the same check result Ck. Because an instruction can be executed multiple times in a loop, we cannot simply execute the assertion checks right after the instruction, as illustrated by the following example.

Example 11.
Consider the FM program P from Example 3. Let V_C = {4, 9}. A memory configuration with Achk[4] = 4 is not valid: Inst[4] is executed repeatedly as part of Inst[3], and the first value of Pre[4] may not be the final invariant. Consequently, executing ϕ_4(Pre[4]) in Inst[4] may not give the same result as executing it in Inst[9] (Achk_dflt[4] = 9). □

In general, because we cannot know the number of iterations of the loop in a repeat instruction, we must wait for the convergence of the maximal component that contains the assertion check. After the maximal component converges, the FM program never visits the component again, making the Pre values of the elements inside the component final. Only if the element is not in any component can its assertion check be executed right after its instruction.

Given a nesting relation N(V, ⪯N) for the graph G(V, →), Achk_opt is defined as:
  Achk_opt[u] ≝ max⪯N ⌈u⌉,    u ∈ V_C        (7)

Because N(V, ⪯N) is a forest, (⌈u⌉, ⪯N) is a chain. Hence, max⪯N is well-defined.

Example 12.
Consider the FM program P from Example 3, whose graph G(V, →) is in Figure 1 and nesting relation N(V, ⪯N) is in Example 5. Suppose that V_C = {4, 9}. Achk_opt[4] = max⪯N {4, 3} = 3 and Achk_opt[9] = max⪯N {9} = 9. □

Dpostℓ_opt

v ∈ Dpostℓ[u] implies that Post[u] can be deallocated at v because it is recomputed and overwritten in the loop of a repeat instruction before a subsequent use of Post[u]. Dpostℓ_opt[u] must be a subset of ⌈u⌉: only the instructions of the heads of components that contain u recompute Post[u]. We can further rule out the instructions of the heads of components that contain Dpost_opt[u], because Inst[Dpost_opt[u]] deletes Post[u]. We add back Dpost_opt[u] to Dpostℓ_opt[u] when u is contained in Dpost_opt[u], because deallocation by Dpost_opt happens after the deallocation by Dpostℓ_opt.

Given a nesting relation N(V, ⪯N) for the graph G(V, →), Dpostℓ_opt is defined as:
  Dpostℓ_opt[u] ≝ (⌈u⌉ \ ⌈d⌉) ∪ (u ⪯N d ? {d} : ∅),    u ∈ V        (8)

where d = Dpost_opt[u] as defined in Eq. 6, and (b ? x : y) is the ternary conditional choice operator.

Example 13.
Consider the FM program P from Example 3, whose graph G(V, →) is in Figure 1, nesting relation N(V, ⪯N) is in Example 5, and Dpost_opt is in Example 10. Dpostℓ_opt[1] = {1}, Dpostℓ_opt[2] = {2}, Dpostℓ_opt[3] = {3}, Dpostℓ_opt[4] = {4}, Dpostℓ_opt[5] = {5, 4, 3}, Dpostℓ_opt[6] = {6, 3}, Dpostℓ_opt[7] = {7}, Dpostℓ_opt[8] = {8, 7}, Dpostℓ_opt[9] = {9}.

For 7, Dpost_opt[7] = 9. Because 7 ⋠N 9, Dpostℓ_opt[7] = ⌈7⌉ \ ⌈9⌉ = {7}. Therefore, Post[7] is deleted in each iteration of the loop of Inst[7]. While Inst[9] reads Post[7] in the future, the particular values of Post[7] that are deleted by Dpostℓ_opt[7] are not used in Inst[9]. For 5, Dpost_opt[5] = 3. Because 5 ⪯N 3, Dpostℓ_opt[5] = (⌈5⌉ \ ⌈3⌉) ∪ {3} = {5, 4, 3}. □

Dpreℓ_opt

v ∈ Dpreℓ[u] implies that Pre[u] can be deallocated at v because it is recomputed and overwritten in the loop of a repeat instruction before a subsequent use of Pre[u]. Dpreℓ_opt[u] must be a subset of ⌈u⌉: only the instructions of the heads of components that contain u recompute Pre[u]. If Inst[u] is a repeat instruction, Pre[u] is required to perform widening. Therefore, u must not be contained in Dpreℓ_opt[u].

Example 14.
Consider the FM program P from Example 3. Let V_C = {4, 9}. A memory configuration with Dpreℓ[4] = {4, 3} is not valid, because Inst[4] would read ⊥ as the value of Pre[4] when performing widening. □
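Eq. 8 above is a direct set computation over the nesting chains. The sketch below (same illustrative nested-list WTO encoding) takes the Dpost_opt values listed in Example 10 as given and reproduces the Dpostℓ_opt table of Example 13:

```python
# WTO of Example 1; sublist = component, head first (illustrative encoding).
WTO = [1, 2, [3, [4, 5], 6], [7, 8], 9]

def up_chains(wto, outer=()):
    """⌈v⌉: v together with the heads of the components containing it."""
    ch = {}
    for item in wto:
        if isinstance(item, list):
            ch.update(up_chains(item[1:], outer + (item[0],)))
            ch[item[0]] = {item[0], *outer}
        else:
            ch[item] = {item, *outer}
    return ch

UP = up_chains(WTO)

# Dpost_opt as listed in Example 10.
DPOST = {1: 2, 2: 7, 3: 7, 4: 6, 5: 3, 6: 3, 7: 9, 8: 7, 9: 9}

def dpost_loop_opt(u):
    """Eq. 8: (⌈u⌉ \ ⌈d⌉) ∪ ({d} if u ⪯N d else ∅), where d = Dpost_opt[u]."""
    d = DPOST[u]
    s = UP[u] - UP[d]
    if d in UP[u]:        # u ⪯N d
        s = s | {d}
    return s

print({u: sorted(dpost_loop_opt(u)) for u in DPOST})
```

For instance, dpost_loop_opt(5) returns {5, 4, 3} and dpost_loop_opt(7) returns {7}, matching the two cases worked out in Example 13.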
Given a nesting relation N(V, ⪯N) for the graph G(V, →), Dpreℓ_opt is defined as:

  Dpreℓ_opt[u] ≝ ⌈u⌉ \ {u},    u ∈ V_C        (9)

Example 15.
Consider the FM program P from Example 3, whose graph G(V, →) is in Figure 1 and nesting relation N(V, ⪯N) is in Example 5. Let V_C = {4, 9}. Dpreℓ_opt[4] = {4, 3} \ {4} = {3} and Dpreℓ_opt[9] = {9} \ {9} = ∅. Therefore, Pre[4] is deleted in each loop iteration of Inst[3]. □

The following theorem is proved in Appendix A.2:
Theorem 1.
The memory configuration M_opt(Dpost_opt, Achk_opt, Dpostℓ_opt, Dpreℓ_opt) is optimal.

4 Construction of M_opt

Algorithm
GenerateFMProgram (Algorithm 1) is an almost-linear time algorithm for computing an FM program P and optimal memory configuration M_opt for a given directed graph G(V, →). Algorithm 1 adapts the bottom-up WTO construction algorithm presented in Kim et al. [27]. In particular, Algorithm 1 applies the genProg rules (Eq. 3) to generate the FM program from a WTO. Line 32 generates exec instructions for non-heads. Line 39 generates repeat instructions for heads, with their bodies ([...]) generated on Line 35. Finally, instructions are merged on Line 48 to construct the final output P.

Algorithm GenerateFMProgram utilizes a disjoint-set data structure. Operation rep(v) returns the representative of the set that contains v. In Line 5, the sets are initialized to rep(v) = v for all v ∈ V. Operation merge(v, h) on Line 43 merges the sets containing v and h, and assigns h to be the representative for the combined set. lca_D(u, v) is the lowest common ancestor of u, v in the depth-first forest D [47]. Cross and forward edges are initially removed from →′ on Line 7, making the graph (V, →′ ∪ B) reducible. Restoring them on Line 9 when h = lca_D(u, v) restores some reachability while keeping (V, →′ ∪ B) reducible.

Algorithm 1:
GenerateFMProgram(G)
Input: Directed graph G(V, →)
Output: FM program pgm, Mopt(Dpostopt, Achkopt, Dpostℓopt, Dpreℓopt)

 1    D := DepthFirstForest(G)
 2    B := back edges in D
 3    CF := cross & forward edges in D
 4    →′ := → \ B
 5    for v ∈ V do rep(v) := v; R[v] := ∅
 6    P := ∅
 7    removeAllCrossFwdEdges()
 8    for h ∈ V in descending DFN_D do
 9        restoreCrossFwdEdges(h)
10        generateFMInstruction(h)
11    pgm := connectFMInstructions()
12    return pgm, Mopt

13    def removeAllCrossFwdEdges():
14        for (u, v) ∈ CF do
15            →′ := →′ \ {(u, v)}
16            R[lcaD(u, v)] := R[lcaD(u, v)] ∪ {(u, v)}    ▷ Lowest common ancestor.

17    def restoreCrossFwdEdges(h):
18        →′ := →′ ∪ {(u, rep(v)) | (u, v) ∈ R[h]}

19    def findNestedSCCs(h):
20        B_h := {rep(p) | (p, h) ∈ B}
21        N_h := ∅                                         ▷ Nested SCCs except h.
22        W := B_h \ {h}                                   ▷ Worklist.
23        while there exists v ∈ W do
24            W, N_h := W \ {v}, N_h ∪ [v]
25            for u s.t. u →′ v do
26                if rep(u) ∉ N_h ∪ {h} ∪ W then
27                    W := W ∪ {rep(u)}
28        return N_h, B_h

29    def generateFMInstruction(h):
30        N_h, B_h := findNestedSCCs(h)
31        if B_h = ∅ then
32            Inst[h] := exec h
33            return
34        for v ∈ N_h in desc. postDFN_D do
35            Inst[h] := Inst[h] ⨟ Inst[v]
36  ⋆         for u s.t. u →′ v do
37  ⋆             Dpostopt[u] := v
38  ⋆             T[u] := rep(u)
39        Inst[h] := repeat h [Inst[h]]
40  ⋆     for u s.t. u →B h do
41  ⋆         Dpostopt[u] := T[u] := h
42        for v ∈ N_h do
43            merge(v, h); P := P ∪ {(v, h)}

44    def connectFMInstructions():
45        pgm := ε                                         ▷ Empty program.
46        for v ∈ V in desc. postDFN_D do
47            if rep(v) = v then
48                pgm := pgm ⨟ Inst[v]
49  ⋆             for u s.t. u →′ v do
50  ⋆                 Dpostopt[u] := v
51  ⋆                 T[u] := rep(u)
52  ⋆         if v ∈ VC then
53  ⋆             Achkopt[v] := rep(v)
54  ⋆             Dpreℓopt[v] := ⌊⌊v, rep(v)⌉⌉P* \ {v}
55  ⋆     for v ∈ V do
56  ⋆         Dpostℓopt[v] := ⌊⌊v, T[v]⌉⌉P*
57        return pgm

Lines indicated by ⋆ in Algorithm 1 compute Mopt. Lines 37, 41, and 50 compute Dpostopt. Due to the specific order in which the algorithm traverses G, Dpostopt[u] is overwritten with greater values (as per the total order ≤) on these lines, making the final value the maximum among the successors. Lift is implicitly applied when restoring the edges in restoreCrossFwdEdges: edge u → v whose
Lift(u, v) = h is replaced by u →′ h on Line 9.

Dpostℓopt is computed using an auxiliary map T : V → V and a relation P : V × V. At the end of the algorithm, T[u] is the maximum element (as per ⪯N) in Dpostℓopt[u]. That is, T[u] = max⪯N((⌊⌊u⌉⌉⪯N \ ⌊⌊d⌉⌉⪯N) ∪ (u ⪯N d ? {d} : ∅)), where d = Dpostopt[u]. Once T[u] is computed by Lines 38, 41, and 51, the transitive reduction P of ⪯N is used to find all elements of Dpostℓopt[u] on Line 56. P is computed on Line 43. Note that P* = ⪯N and ⌊⌊x, y⌉⌉P* def= {v | x P* v ∧ v P* y}. Achkopt and Dpreℓopt are computed on Lines 53 and 54, respectively.
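The two auxiliary structures can be sketched in Python. This is an illustrative sketch under our own assumptions (the class and helper names are ours, and P is represented as a child-to-parent map over the forest formed by its transitive reduction); it is not code from IKOS or Mikos:

```python
class DisjointSet:
    """Union-find in which merge(v, h) makes the head h the representative,
    mirroring rep/merge in Algorithm 1 (the SCC head represents its SCC)."""

    def __init__(self, vertices):
        self.parent = {v: v for v in vertices}

    def rep(self, v):
        # Find the root, then compress the path for almost-linear behavior.
        root = v
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[v] != root:
            self.parent[v], v = root, self.parent[v]
        return root

    def merge(self, v, h):
        # h becomes the representative of the combined set.
        self.parent[self.rep(v)] = self.rep(h)


def interval(x, y, parent):
    """⌊⌊x, y⌉⌉_{P*}: all v with x P* v and v P* y, where the transitive
    reduction of P is given as a child-to-parent map (a forest)."""
    chain, v = [], x
    while True:
        chain.append(v)
        if v == y:
            return set(chain)
        if v not in parent:  # reached a root without meeting y
            return set()
        v = parent[v]
```

For instance, with P built as {(5, 4), (4, 3), (6, 3)} on Line 43, interval(5, 3, {5: 4, 4: 3, 6: 3}) yields {3, 4, 5}, and interval(4, 3, ...) minus {4} yields {3}. The choice of h as representative in merge is deliberate: rep(v) must always name the head of the outermost SCC merged so far.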
Example 16.
Consider the graph G (Figure 1). Labels of vertices indicate a depth-first numbering (DFN) of G. The graph edges are classified into tree, back, cross, and forward edges using the corresponding depth-first forest [14]. The cross and forward edges of G, CF = {(2, 8)}, are removed on Line 7. Because lcaD(2, 8) = 2, the removed edge (2, 8) is restored on Line 9 when h = 2. It is restored as (2, 7), because the disjoint set {8} would have already been merged with {7} on Line 43 when h = 7, making rep(8) equal to 7 when h = 2.

The for-loop on Line 8 visits the nodes in V in descending DFN order. Calling generateFMInstruction(h) on Line 10 generates Inst[h], an FM instruction for h. When h = 9, because the SCC whose entry is 9 is trivial, exec 9 is generated on Line 32. When h = 3, the SCC whose entry is 3 is non-trivial, with the entries of its nested SCCs N_h = {4, 6}. These entries are visited in a topological order (descending postDFN), 4 then 6, and their instructions are connected on Line 35 to generate repeat 3 [Inst[4] ⨟ Inst[6]] on Line 39. Visiting the nodes in descending DFN order guarantees that the instructions of nested SCCs are already present, and removing the cross and forward edges ensures that each SCC has a single entry. Table 1 shows some relevant steps and values within generateFMInstruction.

Finally, calling connectFMInstructions on Line 11 connects the instructions of the entries of the outermost SCCs, detected by the boolean expression rep(v) = v, in a topological order (descending postDFN) to generate the final FM program. For the given example, it visits the entries of the outermost SCCs in descending postDFN order, correctly generating the FM program on Line 48.

Due to two outgoing →′ edges of vertex 2, Dpostopt[2] is overwritten on Line 50, ending at the greater successor. Due to back edges 5 →B 4 and 5 →B 3, Dpostopt[5] is set to 4 and then to 3 on Line 41. Achkopt[4] is set to 3, as rep(4) = 3, on Line 53. T[7] is set to 7 on Line 51, and Dpostℓopt[7] is set to {7} on Line 56. T[5] is set to 4 and then to 3 on Line 41, making Dpostℓopt[5] equal to {3, 4, 5}. Because rep(4) = 3, Dpreℓopt[4] is set to {3} on Line 54. ∎

The proofs of the following theorems are in Appendix A.3:
Theorem 2.
GenerateFMProgram correctly computes Mopt, defined in § 3.

Table 1: Relevant steps and values within
GenerateFMProgram when applied to graph G of Example 16

            Major iteration h = 4      Major iteration h = 3
Line 35     Inst[5]                    Inst[4] ⨟ Inst[6]
Line 39     repeat 4 [exec 5]          repeat 3 [repeat 4 [exec 5] ⨟ exec 6]
Line 37     Dpostopt[4] = 5            Dpostopt[4] = 6, Dpostopt[3] = 4
Line 38     T[4] = 4                   T[4] = 4, T[3] = 3
Line 41     Dpostopt[5] = T[5] = 4     Dpostopt[6] = T[6] = 3, Dpostopt[5] = T[5] = 3
Line 43     Sets {4}, {5} merged.      Sets {3}, {4, 5}, {6} merged.

Theorem 3.
Running time of
GenerateFMProgram is almost-linear.
We have implemented our approach in a tool called
Mikos, which extends NASA's IKOS [11], a WTO-based abstract interpreter for C/C++. Mikos inherits all abstract domains and widening-narrowing strategies from IKOS. It includes the localized narrowing strategy [1] that intertwines the increasing and decreasing sequences.
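As background, the intertwined increasing (widening) and decreasing (narrowing) sequences can be sketched on the interval domain. The following is a textbook-style Python sketch under our own simplifications (a single loop head, tuples for intervals); it is not the localized narrowing of [1] nor code from IKOS:

```python
NEG_INF, POS_INF = float("-inf"), float("inf")

def join(a, b):
    """Least upper bound of two intervals."""
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(a, b):
    """Classic interval widening: any unstable bound jumps to infinity."""
    lo = a[0] if a[0] <= b[0] else NEG_INF
    hi = a[1] if a[1] >= b[1] else POS_INF
    return (lo, hi)

def narrow(a, b):
    """Interval narrowing: refine only the bounds that widening lost."""
    lo = b[0] if a[0] == NEG_INF else a[0]
    hi = b[1] if a[1] == POS_INF else a[1]
    return (lo, hi)

def analyze_loop(init, body, max_iter=100):
    """Fixpoint at a single loop head: an increasing sequence with widening,
    followed by a decreasing sequence with narrowing."""
    x = init
    for _ in range(max_iter):          # increasing (widening) sequence
        nxt = widen(x, join(init, body(x)))
        if nxt == x:
            break
        x = nxt
    for _ in range(max_iter):          # decreasing (narrowing) sequence
        nxt = narrow(x, join(init, body(x)))
        if nxt == x:
            break
        x = nxt
    return x

def body(iv):
    """Transfer function of 'x := x + 1' under the guard x < 100
    (assumes the guard stays satisfiable, which holds in this demo)."""
    return (iv[0] + 1, min(iv[1], 99) + 1)
```

On the loop x := 0; while (x < 100) x := x + 1, widening first drives the loop-head interval to (0, +∞); the narrowing pass then recovers the invariant x ∈ [0, 100].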
Abstract domains in IKOS.
IKOS uses state-of-the-art implementations of abstract domains, comparable to those used in industrial abstract interpreters such as Astrée. In particular, IKOS implements the interval abstract domain [15] using functional data structures based on Patricia trees [35]. Astrée implements intervals using OCaml's map data structure, which uses balanced trees [8, Section 6.2]. As shown in [35, Section 5], the Patricia trees used by IKOS are more efficient when data structures must be merged, which is required often during abstract interpretation. Also, IKOS uses the memory-efficient variable-packing Difference Bound Matrix (DBM) relational abstract domain [18], similar to the variable-packing relational domains employed by Astrée [5, Section 3.3.2].
Interprocedural analysis in IKOS.
IKOS implements context-sensitive interprocedural analysis by means of dynamic inlining, much like the semantic expansion of function bodies in Astrée [16, Section 5]: at a function call, formal and actual parameters are matched, the callee is analyzed, and the return value at the call site is updated after the callee returns; a function pointer is resolved to a set of callees, and the results for each call are joined; IKOS returns top for a callee when a cycle is found in this dynamic call chain. To avoid running the entire interprocedural analysis again during the assertion-checking phase, invariants at the exits of the callees are additionally cached during the fixpoint computation.
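The call-analysis scheme just described can be sketched as a toy Python model. It uses plain values instead of abstract states and one transfer function per callee; the names (analyze_call, FUNCS) are ours, and this is not IKOS code:

```python
TOP = "⊤"  # unknown value, returned when recursion is detected

def analyze_call(fn, arg, funcs, chain=(), cache=None):
    """Toy model of interprocedural analysis by dynamic inlining: analyze the
    callee at each call site, and return TOP when the dynamic call chain
    contains a cycle. `funcs` maps a function name to (callee names, transfer
    function); the transfer function receives the argument and callee results."""
    if cache is None:
        cache = {}
    if fn in chain:                    # cycle in the dynamic call chain
        return TOP
    callees, transfer = funcs[fn]
    results = [analyze_call(c, arg, funcs, chain + (fn,), cache) for c in callees]
    out = transfer(arg, results)
    cache[fn] = out                    # exit invariant cached for the checking phase
    return out

# Example program: main calls f, f calls g, g returns arg + 1; r calls itself.
FUNCS = {
    "main": (["f"], lambda arg, rs: rs[0]),
    "f":    (["g"], lambda arg, rs: rs[0]),
    "g":    ([],    lambda arg, rs: arg + 1),
    "r":    (["r"], lambda arg, rs: rs[0]),
}
```

Here analyze_call("main", 5, FUNCS) analyzes the chain main → f → g and returns 6, while the self-recursive r immediately degrades to TOP, mirroring the cycle rule described above.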
Interprocedural extension of
Mikos. Although the description of our iteration strategy focused on intraprocedural analysis, it can be extended to interprocedural analysis as follows. Suppose there is a call to function f1 from a basic block contained in component C. Any checks in this call to f1 must be deferred until we know that the component C has stabilized. Furthermore, if function f1 calls a function f2, then the checks in f2 must also be deferred until C converges. In general, checks corresponding to a function call must be deferred until the maximal component containing the call has stabilized. When the analysis of a callee returns in Mikos, only the Pre values for the deferred checks remain. They are deallocated when the checks are performed or when the component containing the call is reiterated.
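The deferral target, the head of the maximal component containing a call, can be computed by walking up the component nesting. A minimal sketch, assuming the nesting is given as two hypothetical maps (the innermost enclosing head per vertex, and the parent head per head):

```python
def outermost_component(innermost, parent, v):
    """Return the head of the maximal component containing vertex v, or None
    if v lies outside every component. `innermost` maps a vertex to the head
    of its innermost enclosing component; `parent` maps a head to the head of
    the component directly enclosing it (absent/None at the top level)."""
    h = innermost.get(v)
    while h is not None and parent.get(h) is not None:
        h = parent[h]
    return h
```

With a nesting like the one in Example 16, where component 4 is nested inside component 3 (innermost = {3: 3, 4: 4, 5: 4, 6: 3}, parent = {4: 3}), a call made from vertex 5 has its checks deferred until the component headed by 3 stabilizes; a vertex outside every component yields None, meaning its checks need not be deferred.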
The experiments in this section were designed to answer the following questions:
RQ0 [Accuracy]
Does
Mikos (§5) have the same analysis results as IKOS?
RQ1 [Memory footprint]
How does the memory footprint of
Mikos compare to that of IKOS?
RQ2 [Runtime]
How does the runtime of
Mikos compare to that of IKOS?
Experimental setup
All experiments were run on Amazon EC2 r5.2xlarge instances (64 GiB memory, 8 vCPUs, 4 physical cores), which use Intel Xeon Platinum 8175M processors. The processors have L1, L2, and L3 caches of sizes 1.5 MiB (data: 0.75 MiB, instruction: 0.75 MiB), 24 MiB, and 33 MiB, respectively. Linux kernel version 4.15.0-1051-aws was used, and gcc 7.4.0 was used to compile both Mikos and IKOS. Dedicated EC2 instances and BenchExec [7] were used to improve the reliability of the results. The time and space limits were set to an hour and 64 GB, respectively. The experiments can be reproduced using https://github.com/95616ARG/mikos_sas2020.
Benchmarks
We evaluated
Mikos on two tasks that represent different client applications of abstract interpretation, each using different benchmarks described in Sections 6.1 and 6.2. In both tasks, we excluded benchmarks that did not complete in both
IKOS and
Mikos given the time and space budget. There were no benchmarks for which IKOS succeeded but Mikos failed to complete. Benchmarks for which IKOS took less than 5 seconds were also excluded. Measurements for benchmarks that took less than 5 seconds are summarized in Appendix B.
Metrics
To answer RQ1, we define and use the memory reduction ratio (MRR):

MRR def= (Memory footprint of Mikos) / (Memory footprint of IKOS)    (10)

The smaller the MRR, the greater the reduction in peak-memory usage in Mikos. If the MRR is less than 1, Mikos has a smaller memory footprint than IKOS. For RQ2, we report the speedup, which is defined as:

Speedup def= (Runtime of IKOS) / (Runtime of Mikos)    (11)

The larger the speedup, the greater the reduction in runtime in Mikos. If the speedup is greater than 1, Mikos is faster than IKOS.

Fig. 3: Task T1. Log-log scatter plots of (a) memory footprint and (b) runtime of IKOS and Mikos, with an hour timeout and 64 GB spaceout. Benchmarks that did not complete in IKOS are marked ×. All ×s completed in Mikos. Benchmarks below y = x required less memory or runtime in Mikos. (a) Min MRR: 0.895; max MRR: 0.001; geometric means: (i) 0.044 (when ×s are ignored), (ii) 0.041 (when measurements until timeout/spaceout are used for ×s); 29 non-completions in IKOS. (b) Min speedup: 0.87×; max speedup: 1.80×; geometric mean: 1.29×. ×s are ignored, as they space out quickly in IKOS but complete in Mikos.

RQ0: Accuracy of
Mikos
As a sanity check for our theoretical results, we experimentally validated Theorem 1 by comparing the analysis results reported by IKOS and Mikos. Mikos used a valid memory configuration, reporting the same analysis results as IKOS. Recall that Theorem 1 also proves that the fixpoint computation in Mikos is memory-optimal (i.e., it results in minimum memory footprint).
Benchmarks
For Task T1, we selected all 2928 benchmarks from the DeviceDriversLinux64, ControlFlow, and Loops categories of SV-COMP 2019 [6]. These categories are well suited for numerical analysis, and have been used in recent works [45,46,27]. From these benchmarks, we removed 435 benchmarks that timed out in both Mikos and IKOS, and 1709 benchmarks that took less than 5 seconds in IKOS. That left us with 784 SV-COMP 2019 benchmarks.
Abstract domain
Task T1 used the reduced product of Difference Bound Matrix (DBM) with variable packing [18] and congruence [21]. This domain is much richer and more expressive than the interval domain used in task T2.
Task
Task T1 consists of using the results of interprocedural fixpoint computation to prove user-provided assertions in the SV-COMP benchmarks. Each benchmark typically has one assertion to prove.
RQ1: Memory footprint of
Mikos compared to IKOS
Figure 3(a) shows the measured memory footprints in a log-log scatter plot.

Fig. 4: Histograms of MRR (Eq. 10) in task T1 for different ranges of memory footprint in IKOS: (a) 0%–25%, (b) 25%–50%, (c) 50%–75%, (d) 75%–100%. Figure 4(a) shows the distribution for the benchmarks in the bottom 25% in terms of memory footprint in IKOS. The distribution significantly tended toward a smaller MRR in the upper ranges.

For Task T1, the MRR (Eq. 10) ranged from 0.895 to 0.001. That is, the memory footprint decreased to 0.1% in the best case. For all benchmarks, Mikos had a smaller memory footprint than IKOS: the MRR was less than 1 for all benchmarks, with all points below the y = x line in Figure 3(a). On average, Mikos required only 4.1% of the memory required by IKOS, with an MRR of 0.041 as the geometric mean.

As Figure 3(a) shows, the reduction in memory tended to be greater as the memory footprint in the baseline IKOS grew. For the top 25% of benchmarks with the largest memory footprint in IKOS, the geometric mean of MRRs was 0.009. This trend is further confirmed by the histograms in Figure 4. While a similar trend was observed in task T2, it was significantly stronger in task T1. Table 3 in Appendix B lists
RQ1 results for specific benchmarks.
RQ2: Runtime of
Mikos compared to IKOS
Figure 3(b) shows the measured runtimes in a log-log scatter plot. We measured both the speedup (Eq. 11) and the difference in the runtimes. For fair comparison, we excluded 29 benchmarks that did not complete in IKOS. This left us with 755 SV-COMP 2019 benchmarks. Out of these 755 benchmarks, 740 had speedup > 1. The speedup ranged from 0.87× to 1.80×, with a geometric mean of 1.29×. We also measured the difference in runtimes (runtime of IKOS − runtime of Mikos). Table 4 in Appendix B lists RQ2 results for specific benchmarks.

Fig. 5: Task T2. Log-log scatter plots of (a) memory footprint and (b) runtime of IKOS and Mikos, with an hour timeout and 64 GB spaceout. Benchmarks that did not complete in IKOS are marked ×. All ×s completed in Mikos. Benchmarks below y = x required less memory or runtime in Mikos. (a) Min MRR: 0.998; max MRR: 0.022; geometric means: (i) 0.436 (when ×s are ignored), (ii) 0.437 (when measurements until timeout/spaceout are used for ×s); 1 non-completion in IKOS. (b) Min speedup: 0.88×; max speedup: 2.83×; geometric mean: 1.08×. ×s are ignored, as they space out quickly in IKOS but complete in Mikos.
Benchmarks
For Task T2, we selected all 1503 programs from the official Arch Linux core packages that are primarily written in C and whose LLVM bitcodes are obtainable by gllvm [20]. These include, but are not limited to, coreutils, dhcp, gnupg, inetutils, iproute, nmap, openssh, vim, etc. From these benchmarks, we removed 76 benchmarks that timed out and 8 benchmarks that spaced out in both Mikos and IKOS. Also, 994 benchmarks that took less than 5 seconds in IKOS were removed. That left us with 425 open-source benchmarks.
Abstract domain
Task T2 used the interval abstract domain [15]. Using a richer domain like DBM caused IKOS and Mikos to time out on most benchmarks.
Task
Task T2 consists of using the results of interprocedural fixpoint computation to prove the safety of buffer accesses. In this task, most program points had checks.

Fig. 6: Histograms of MRR (Eq. 10) in task T2 for different ranges of memory footprint in IKOS: (a) 0%–25%, (b) 25%–50%, (c) 50%–75%, (d) 75%–100%. Figure 6(a) shows the distribution for the benchmarks in the bottom 25% in terms of memory footprint in IKOS. The distribution slightly tended toward a smaller MRR in the upper ranges.
RQ1: Memory footprint of
Mikos compared to IKOS
Figure 5(a) shows the measured memory footprints in a log-log scatter plot. For Task T2, the MRR (Eq. 10) ranged from 0.998 to 0.022. That is, the memory footprint decreased to 2.2% in the best case. For all benchmarks, Mikos had a smaller memory footprint than IKOS: the MRR was less than 1 for all benchmarks, with all points below the y = x line in Figure 5(a). On average, Mikos's memory footprint was less than half that of IKOS, with an MRR of 0.437 as the geometric mean. Table 5 in Appendix B lists
RQ1 results for specific benchmarks.
RQ2: Runtime of
Mikos compared to IKOS
Figure 5(b) shows the mea-sured runtime in a log-log scatter plot. We measured both the speedup (Eq. 11)and the difference in the runtimes. For fair comparison, we excluded 1 benchmarkthat did not complete in IKOS. This left us with 425 open-source benchmarks.Out of these 425 benchmarks, 331 benchmarks had speedup > . The speedupranged from 0.88 × to 2.83 × , with geometric mean of 1.08 × . The difference in emory-Efficient Fixpoint Computation 21 runtimes (runtime of IKOS − runtime of Mikos ) ranged from − . s to . s, with arithmetic mean of . s. Table 6 in Appendix B lists RQ2 results for specific benchmarks.
Abstract interpretation has a long history of designing time- and memory-efficient algorithms for specific abstract domains, which exploit variable packing and clustering and sparse constraints [46,45,44,43,25,19,13,23]. Often these techniques represent a trade-off between precision and performance of the analysis. Nonetheless, such techniques are orthogonal to the abstract-domain-agnostic approach discussed in this paper. Approaches for improving precision via sophisticated widening and narrowing strategies [22,2,3] are also orthogonal to our memory-efficient iteration strategy.
Mikos inherits the interleaved widening-narrowing strategy implemented in the baseline IKOS abstract interpreter. As noted in § 1, Bourdoncle's approach [10] is used in many industrial and academic abstract interpreters [11,32,17,48,12]. Thus, improving the memory efficiency of WTO-based exploration is of great applicability to real-world static analysis. Astrée is one of the few, if not the only, industrial abstract interpreters that does not use WTO exploration, because it assumes that programs do not have gotos and recursion [8, Section 2.1], and is targeted towards a specific class of embedded C code [5, Section 3.2]. Such restrictions make it easier to compute when an abstract value will no longer be used, by naturally following the abstract syntax tree [29, Section 3.4.3]. In contrast,
Mikos works for general programs with gotos and recursion, which require the use of WTO-based exploration.

Generic fixpoint-computation approaches for improving the running time of abstract interpretation have also been explored [52,30,27]. Most recently, Kim et al. [27] present the notion of weak partial order (WPO), which generalizes the notion of WTO that is used in this paper. Kim et al. describe a parallel fixpoint algorithm that exploits maximal parallelism while computing the same fixpoint as the WTO-based algorithm. Reasoning about the correctness of concurrent algorithms is complex; hence, we decided to investigate an optimal memory management scheme in the sequential setting first. However, we believe it would be possible to extend our WTO-based result to one that uses WPO.

The nesting relation described in § 3 is closely related to the notion of Loop Nesting Forest [36,37], as observed in Kim et al. [27]. The almost-linear time algorithm
GenerateFMProgram is an adaptation of the LNF construction algorithm by Ramalingam [36]. The
Lift operation in § 3 is similar to the outermost-loop-excluding (OLE) operator introduced by Rastello [38, Section 2.4.4].

Seidl et al. [42] present time and space improvements to a generic fixpoint solver, which is closest in spirit to the problem discussed in this paper. To improve space efficiency, their approach recomputes values during fixpoint computation, and it does not prove optimality, unlike our approach. However, the setting discussed in their work is also more generic than ours; we assume a static dependency graph for the equation system.
Abstract interpreters such as Astrée [8] and CodeHawk [48] are implemented in OCaml, which provides a garbage collector. However, merely using a reference-counting garbage collector will not reduce the peak memory usage of fixpoint computation. For instance, the reference count of Pre[u] can be decreased to zero only after the final check/assert that uses Pre[u]. If the checks are all conducted at the end of the analysis (as is currently done in prior tools), then using a reference-counting garbage collector will not reduce peak memory usage. In contrast, our approach lifts the checks as early as possible, enabling the analysis to free the abstract values as early as possible.

Symbolic approaches for applying abstract transformers during fixpoint computation [24,40,28,41,50,49,51] allow the entire loop body to be encoded as a single formula. This might appear to obviate the need for Pre and
Post values for individual basic blocks within the loop; by storing the
Pre value only at the header, such a symbolic approach might appear to reduce the memory footprint. First, this scenario does not account for the fact that
Pre values need to be computed and stored if basic blocks in the loop have checks. Note that if there are no checks within the loop body, then our approach would also only store the
Pre value at the loop header. Second, such symbolic approaches only perform intraprocedural analysis [24]; additional abstract values would need to be stored depending on how function calls are handled in interprocedural analysis. Third, due to the use of SMT solvers in such symbolic approaches, the memory footprint might not necessarily decrease, and might even increase if one takes into account the memory used by the SMT solver.

Sparse analysis [34,33] and database-backed analysis [54] improve the memory cost of static analysis. For specific classes of static analysis, such as the IFDS framework [39], there have been approaches for improving time and memory efficiency [9,31,53,55].
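The point about check placement can be illustrated with a toy accounting model: if every Pre[u] dies only at the end of the analysis (checks performed at the end, reference counting or not), the peak equals the total number of Pre values, whereas if each check is lifted to the last point where its Pre value is needed, the peak drops. The model below is our own simplification, not the memory model of IKOS or Mikos:

```python
def peak_live(pre_uses):
    """pre_uses[u] is the index of the program point whose check performs the
    last use of Pre[u]. Returns the peak number of simultaneously live Pre
    values when each Pre[u] is freed right after its last use."""
    n = len(pre_uses)
    live, peak = 0, 0
    freed_at = {}
    for u, last_use in enumerate(pre_uses):
        freed_at.setdefault(last_use, []).append(u)
    for point in range(n):
        live += 1                               # Pre[point] materializes here
        peak = max(peak, live)
        live -= len(freed_at.get(point, []))    # checks run, values freed
    return peak

# All checks deferred to the very end: every Pre value stays live throughout,
# so peak_live([n-1]*n) == n. Checks lifted as early as possible: each Pre[u]
# dies immediately, so peak_live(list(range(n))) == 1.
```

In this model, a reference-counting collector corresponds to freeing exactly at the last use; what lowers the peak is moving the last uses (the checks) earlier, which is the effect of our check lifting.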
This paper presented an approach for memory-efficient abstract interpretation that is agnostic to the abstract domain used. Our approach is memory-optimal and produces the same result as Bourdoncle's approach without sacrificing time efficiency. We extended the notion of iteration strategy to intelligently deallocate abstract values and perform assertion checks during fixpoint computation. We provided an almost-linear time algorithm that constructs this iteration strategy. We implemented our approach in a tool called Mikos, which extends the abstract interpreter IKOS. Despite the use of state-of-the-art implementations of abstract domains, IKOS had a large memory footprint on two analysis tasks. Mikos was shown to effectively reduce it. When verifying user-provided assertions in SV-COMP 2019 benchmarks, Mikos showed a decrease in peak-memory usage to 4.1% (24.4×) on average compared to IKOS. When performing interprocedural buffer-overflow analysis of open-source programs, Mikos showed a decrease in peak-memory usage to 43.7% (2.3×) on average compared to IKOS.

References
1. Amato, G., Scozzari, F.: Localizing widening and narrowing. In: Static Analysis - 20th International Symposium, SAS 2013, Seattle, WA, USA, June 20-22, 2013. Proceedings. pp. 25–42 (2013). https://doi.org/10.1007/978-3-642-38856-9_4
2. Amato, G., Scozzari, F., Seidl, H., Apinis, K., Vojdani, V.: Efficiently intertwining widening and narrowing. Sci. Comput. Program., 1–24 (2016). https://doi.org/10.1016/j.scico.2015.12.005
3. Apinis, K., Seidl, H., Vojdani, V.: Enhancing top-down solving with widening and narrowing. In: Probst, C.W., Hankin, C., Hansen, R.R. (eds.) Semantics, Logics, and Calculi - Essays Dedicated to Hanne Riis Nielson and Flemming Nielson on the Occasion of Their 60th Birthdays. Lecture Notes in Computer Science, vol. 9560, pp. 272–288. Springer (2016). https://doi.org/10.1007/978-3-319-27810-0_14
4. Bagnara, R., Hill, P.M., Zaffanella, E.: The Parma Polyhedra Library: Toward a complete set of numerical abstractions for the analysis and verification of hardware and software systems. Sci. Comput. Program. (1-2), 3–21 (2008). https://doi.org/10.1016/j.scico.2007.08.001
5. Bertrane, J., Cousot, P., Cousot, R., Feret, J., Mauborgne, L., Miné, A., Rival, X.: Static analysis by abstract interpretation of embedded critical software. ACM SIGSOFT Software Engineering Notes (1), 1–8 (2011). https://doi.org/10.1145/1921532.1921553
6. Beyer, D.: Automatic verification of C and Java programs: SV-COMP 2019. In: Tools and Algorithms for the Construction and Analysis of Systems - 25 Years of TACAS: TOOLympics, Held as Part of ETAPS 2019, Prague, Czech Republic, April 6-11, 2019, Proceedings, Part III. pp. 133–155 (2019). https://doi.org/10.1007/978-3-030-17502-3_9
7. Beyer, D., Löwe, S., Wendler, P.: Reliable benchmarking: requirements and solutions. STTT (1), 1–29 (2019). https://doi.org/10.1007/s10009-017-0469-y
8. Blanchet, B., Cousot, P., Cousot, R., Feret, J., Mauborgne, L., Miné, A., Monniaux, D., Rival, X.: Design and implementation of a special-purpose static program analyzer for safety-critical real-time embedded software. In: Mogensen, T.Æ., Schmidt, D.A., Sudborough, I.H. (eds.) The Essence of Computation, Complexity, Analysis, Transformation. Essays Dedicated to Neil D. Jones [on occasion of his 60th birthday]. Lecture Notes in Computer Science, vol. 2566, pp. 85–108. Springer (2002). https://doi.org/10.1007/3-540-36377-7_5
9. Bodden, E.: Inter-procedural data-flow analysis with IFDS/IDE and Soot. In: Bodden, E., Hendren, L.J., Lam, P., Sherman, E. (eds.) Proceedings of the ACM SIGPLAN International Workshop on State of the Art in Java Program Analysis, SOAP 2012, Beijing, China, June 14, 2012. pp. 3–8. ACM (2012). https://doi.org/10.1145/2259051.2259052
10. Bourdoncle, F.: Efficient chaotic iteration strategies with widenings. In: Formal Methods in Programming and Their Applications, International Conference, Akademgorodok, Novosibirsk, Russia, June 28 - July 2, 1993, Proceedings. pp. 128–141 (1993). https://doi.org/10.1007/BFb0039704
11. Brat, G., Navas, J.A., Shi, N., Venet, A.: IKOS: A framework for static analysis based on abstract interpretation. In: Software Engineering and Formal Methods - 12th International Conference, SEFM 2014, Grenoble, France, September 1-5, 2014. Proceedings. pp. 271–277 (2014). https://doi.org/10.1007/978-3-319-10431-7_20
12. Calcagno, C., Distefano, D.: Infer: An automatic program verifier for memory safety of C programs. In: Bobaru, M.G., Havelund, K., Holzmann, G.J., Joshi, R. (eds.) NASA Formal Methods - Third International Symposium, NFM 2011, Pasadena, CA, USA, April 18-20, 2011. Proceedings. Lecture Notes in Computer Science, vol. 6617, pp. 459–465. Springer (2011). https://doi.org/10.1007/978-3-642-20398-5_33
13. Chawdhary, A., King, A.: Compact difference bound matrices. In: Chang, B.E. (ed.) Programming Languages and Systems - 15th Asian Symposium, APLAS 2017, Suzhou, China, November 27-29, 2017, Proceedings. Lecture Notes in Computer Science, vol. 10695, pp. 471–490. Springer (2017). https://doi.org/10.1007/978-3-319-71237-6_23
14. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 3rd Edition. MIT Press (2009)
15. Cousot, P., Cousot, R.: Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In: Conference Record of the Fourth ACM Symposium on Principles of Programming Languages, Los Angeles, California, USA, January 1977. pp. 238–252 (1977). https://doi.org/10.1145/512950.512973
16. Cousot, P., Cousot, R., Feret, J., Mauborgne, L., Miné, A., Monniaux, D., Rival, X.: The Astrée analyzer. In: Sagiv, S. (ed.) Programming Languages and Systems, 14th European Symposium on Programming, ESOP 2005, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2005, Edinburgh, UK, April 4-8, 2005, Proceedings. Lecture Notes in Computer Science, vol. 3444, pp. 21–30. Springer (2005). https://doi.org/10.1007/978-3-540-31987-0_3
17. Facebook: SPARTA. https://github.com/facebookincubator/SPARTA (2020)
18. Gange, G., Navas, J.A., Schachte, P., Søndergaard, H., Stuckey, P.J.: An abstract domain of uninterpreted functions. In: Verification, Model Checking, and Abstract Interpretation - 17th International Conference, VMCAI 2016, St. Petersburg, FL, USA, January 17-19, 2016. Proceedings. pp. 85–103 (2016). https://doi.org/10.1007/978-3-662-49122-5_4
19. Gange, G., Navas, J.A., Schachte, P., Søndergaard, H., Stuckey, P.J.: Exploiting sparsity in difference-bound matrices. In: Rival, X. (ed.) Static Analysis - 23rd International Symposium, SAS 2016, Edinburgh, UK, September 8-10, 2016, Proceedings. Lecture Notes in Computer Science, vol. 9837, pp. 189–211. Springer (2016). https://doi.org/10.1007/978-3-662-53413-7_10
20. gllvm. https://github.com/SRI-CSL/gllvm (2020)
21. Granger, P.: Static analysis of arithmetical congruences. International Journal of Computer Mathematics (3-4), 165–190 (1989). https://doi.org/10.1080/00207168908803778
22. Halbwachs, N., Henry, J.: When the decreasing sequence fails. In: Static Analysis - 19th International Symposium, SAS 2012, Deauville, France, September 11-13, 2012. Proceedings. pp. 198–213 (2012). https://doi.org/10.1007/978-3-642-33125-1_15
23. Halbwachs, N., Merchat, D., Gonnord, L.: Some ways to reduce the space dimension in polyhedra computations. Formal Methods Syst. Des. (1), 79–95 (2006). https://doi.org/10.1007/s10703-006-0013-2
24. Henry, J., Monniaux, D., Moy, M.: PAGAI: A path sensitive static analyser. Electron. Notes Theor. Comput. Sci., 15–25 (2012). https://doi.org/10.1016/j.entcs.2012.11.003
25. Heo, K., Oh, H., Yang, H.: Learning a variable-clustering strategy for octagon from labeled data generated by a static analysis. In: Rival, X. (ed.) Static Analysis - 23rd International Symposium, SAS 2016, Edinburgh, UK, September 8-10, 2016, Proceedings. Lecture Notes in Computer Science, vol. 9837, pp. 237–256. Springer (2016). https://doi.org/10.1007/978-3-662-53413-7_12
26. Jeannet, B., Miné, A.: Apron: A library of numerical abstract domains for static analysis. In: Bouajjani, A., Maler, O. (eds.) Computer Aided Verification, 21st International Conference, CAV 2009, Grenoble, France, June 26 - July 2, 2009. Proceedings. Lecture Notes in Computer Science, vol. 5643, pp. 661–667. Springer (2009). https://doi.org/10.1007/978-3-642-02658-4_52
27. Kim, S.K., Venet, A.J., Thakur, A.V.: Deterministic parallel fixpoint computation. PACMPL (POPL), 14:1–14:33 (2020). https://doi.org/10.1145/3371082
28. Li, Y., Albarghouthi, A., Kincaid, Z., Gurfinkel, A., Chechik, M.: Symbolic optimization with SMT solvers. In: Jagannathan, S., Sewell, P. (eds.) The 41st Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '14, San Diego, CA, USA, January 20-21, 2014. pp. 607–618. ACM (2014). https://doi.org/10.1145/2535838.2535857
29. Miné, A.: Tutorial on static inference of numeric invariants by abstract interpretation. Foundations and Trends in Programming Languages (3-4), 120–372 (2017). https://doi.org/10.1561/2500000034
30. Monniaux, D.: The parallel implementation of the Astrée static analyzer. In: Programming Languages and Systems, Third Asian Symposium, APLAS 2005, Tsukuba, Japan, November 2-5, 2005, Proceedings. pp. 86–96 (2005). https://doi.org/10.1007/11575467_7
31. Naeem, N.A., Lhoták, O., Rodriguez, J.: Practical extensions to the IFDS algorithm. In: Gupta, R. (ed.) Compiler Construction, 19th International Conference, CC 2010, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2010, Paphos, Cyprus, March 20-28, 2010. Proceedings. Lecture Notes in Computer Science, vol. 6011, pp. 124–144. Springer (2010). https://doi.org/10.1007/978-3-642-11970-5_8
32. Navas, J.A.: Crab: Cornucopia of abstractions: a language-agnostic library for abstract interpretation. https://github.com/seahorn/crab (2019)
33. Oh, H., Heo, K., Lee, W., Lee, W., Park, D., Kang, J., Yi, K.: Global sparse analysis framework. ACM Trans. Program. Lang. Syst. (3), 8:1–8:44 (2014). https://doi.org/10.1145/2590811
34. Oh, H., Heo, K., Lee, W., Lee, W., Yi, K.: Design and implementation of sparse global analyses for C-like languages. In: ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '12, Beijing, China - June 11-16, 2012. pp. 229–238 (2012). https://doi.org/10.1145/2254064.2254092
35. Okasaki, C., Gill, A.: Fast mergeable integer maps. In: Workshop on ML. pp. 77–86 (1998)
36. Ramalingam, G.: Identifying loops in almost linear time. ACM Trans. Program. Lang. Syst. (2), 175–188 (1999). https://doi.org/10.1145/316686.316687
37. Ramalingam, G.: On loops, dominators, and dominance frontiers. ACM Trans. Program. Lang. Syst. (5), 455–490 (2002). https://doi.org/10.1145/570886.570887
38. Rastello, F.: On Sparse Intermediate Representations: Some Structural Properties and Applications to Just-In-Time Compilation. University works, Inria Grenoble Rhône-Alpes (Dec 2012), https://hal.inria.fr/hal-00761555, habilitation à diriger des recherches, École normale supérieure de Lyon
39. Reps, T.W., Horwitz, S., Sagiv, M.: Precise interprocedural dataflow analysis via graph reachability. In: Conference Record of POPL'95: 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, San Francisco, California, USA, January 23-25, 1995. pp. 49–61 (1995). https://doi.org/10.1145/199448.199462
40. Reps, T.W., Sagiv, S., Yorsh, G.: Symbolic implementation of the best transformer. In: Steffen, B., Levi, G. (eds.) Verification, Model Checking, and Abstract Interpretation, 5th International Conference, VMCAI 2004, Venice, Italy, January 11-13, 2004, Proceedings. Lecture Notes in Computer Science, vol. 2937, pp. 252–266. Springer (2004). https://doi.org/10.1007/978-3-540-24622-0_21
41. Reps, T.W., Thakur, A.V.: Automating abstract interpretation. In: Jobstmann, B., Leino, K.R.M. (eds.) Verification, Model Checking, and Abstract Interpretation - 17th International Conference, VMCAI 2016, St. Petersburg, FL, USA, January 17-19, 2016. Proceedings. Lecture Notes in Computer Science, vol. 9583, pp. 3–40. Springer (2016). https://doi.org/10.1007/978-3-662-49122-5_1
42. Seidl, H., Vogler, R.: Three improvements to the top-down solver. In: Sabel, D., Thiemann, P. (eds.) Proceedings of the 20th International Symposium on Principles and Practice of Declarative Programming, PPDP 2018, Frankfurt am Main, Germany, September 03-05, 2018. pp. 21:1–21:14. ACM (2018). https://doi.org/10.1145/3236950.3236967
43. Singh, G., Püschel, M., Vechev, M.T.: Making numerical program analysis fast. In: Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, Portland, OR, USA, June 15-17, 2015. pp. 303–313 (2015). https://doi.org/10.1145/2737924.2738000
44. Singh, G., Püschel, M., Vechev, M.T.: Fast polyhedra abstract domain. In: Castagna, G., Gordon, A.D. (eds.) Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, POPL 2017, Paris, France, January 18-20, 2017. pp. 46–59. ACM (2017). https://doi.org/10.1145/3009837.3009885
45. Singh, G., Püschel, M., Vechev, M.T.: Fast numerical program analysis with reinforcement learning. In: Computer Aided Verification - 30th International Conference, CAV 2018, Held as Part of the Federated Logic Conference, FloC 2018, Oxford, UK, July 14-17, 2018, Proceedings, Part I. pp. 211–229 (2018). https://doi.org/10.1007/978-3-319-96145-3_12
46. Singh, G., Püschel, M., Vechev, M.T.: A practical construction for decomposing numerical abstract domains. Proc. ACM Program. Lang. (POPL), 55:1–55:28 (2018). https://doi.org/10.1145/3158143
47. Tarjan, R.E.: Applications of path compression on balanced trees. J. ACM (4), 690–715 (1979). https://doi.org/10.1145/322154.322161
48. Kestrel Technology: CodeHawk. https://github.com/kestreltechnology/codehawk (2020)
49. Thakur, A.V., Elder, M., Reps, T.W.: Bilateral algorithms for symbolic abstraction. In: Miné, A., Schmidt, D. (eds.) Static Analysis - 19th International Symposium, SAS 2012, Deauville, France, September 11-13, 2012. Proceedings. Lecture Notes in Computer Science, vol. 7460, pp. 111–128. Springer (2012). https://doi.org/10.1007/978-3-642-33125-1_10
50. Thakur, A.V., Lal, A., Lim, J., Reps, T.W.: PostHat and all that: Automating abstract interpretation. Electron. Notes Theor. Comput. Sci., 15–32 (2015). https://doi.org/10.1016/j.entcs.2015.02.003
51. Thakur, A.V., Reps, T.W.: A method for symbolic computation of abstract operations. In: Madhusudan, P., Seshia, S.A. (eds.)
Computer Aided Verification - 24thInternational Conference, CAV 2012, Berkeley, CA, USA, July 7-13, 2012 Proceed-ings. Lecture Notes in Computer Science, vol. 7358, pp. 174–192. Springer (2012).https://doi.org/10.1007/978-3-642-31424-7_1752. Venet, A., Brat, G.P.: Precise and efficient static array bound checking for largeembedded C programs. In: Proceedings of the ACM SIGPLAN 2004 Conference onProgramming Language Design and Implementation 2004, Washington, DC, USA,June 9-11, 2004. pp. 231–242 (2004). https://doi.org/10.1145/996841.99686953. Wang, K., Hussain, A., Zuo, Z., Xu, G.H., Sani, A.A.: Graspan: A single-machine disk-based graph system for interprocedural static analyses of large-scale systems code. In: Proceedings of the Twenty-Second International Con-ference on Architectural Support for Programming Languages and OperatingSystems, ASPLOS 2017, Xi’an, China, April 8-12, 2017. pp. 389–404 (2017).https://doi.org/10.1145/3037697.303774454. Weiss, C., Rubio-González, C., Liblit, B.: Database-backed program analysis forscalable error propagation. In: 37th IEEE/ACM International Conference on Soft-ware Engineering, ICSE 2015, Florence, Italy, May 16-24, 2015, Volume 1. pp.586–597 (2015). https://doi.org/10.1109/ICSE.2015.7555. Zuo, Z., Gu, R., Jiang, X., Wang, Z., Huang, Y., Wang, L., Li, X.: Bigspa:An efficient interprocedural static analysis engine in the cloud. In: 2019IEEE International Parallel and Distributed Processing Symposium, IPDPS2019, Rio de Janeiro, Brazil, May 20-24, 2019. pp. 771–780. IEEE (2019).https://doi.org/10.1109/IPDPS.2019.000868 S. Kim et al. A Proofs
This section provides proofs of theorems presented in the paper.
A.1 Nesting forest (V, ⪯_N) and total order (V, ≤) in § 3

This section presents the theorems and proofs about ⪯_N and ≤ defined in § 3. A partial order (S, R) is a forest if for all x ∈ S, (⌊⌊x⌉⌉_R, R) is a chain, where ⌊⌊x⌉⌉_R def= {y ∈ S | x R y}.

Theorem 4. (V, ⪯_N) is a forest.

Proof. First, we show that (V, ⪯_N) is a partial order. Let x, y, z be vertices in V.
– Reflexivity: x ⪯_N x. This holds by the definition of ⪯_N.
– Transitivity: x ⪯_N y and y ⪯_N z imply x ⪯_N z. (i) If x = y, then x ⪯_N z. (ii) Otherwise, by the definition of ⪯_N, y ∈ ω(x). Furthermore, (ii-1) if y = z, then z ∈ ω(x), and hence x ⪯_N z. (ii-2) Otherwise, z ∈ ω(y), and by the definition of HTO, z ∈ ω(x).
– Anti-symmetry: x ⪯_N y and y ⪯_N x imply x = y. Suppose x ≠ y. By the definition of ⪯_N and the premises, y ∈ ω(x) and x ∈ ω(y). Then, by the definition of HTO, x ≺ y and y ≺ x. This contradicts the fact that ⪯ is a total order.
Next, we show that this partial order is a forest. Suppose there exists v ∈ V such that (⌊⌊v⌉⌉_⪯N, ⪯_N) is not a chain. That is, there exist x, y ∈ ⌊⌊v⌉⌉_⪯N such that x ⋠_N y and y ⋠_N x. Then, by the definition of HTO, C(x) ∩ C(y) = ∅. However, this contradicts that v ∈ C(x) and v ∈ C(y). ⊓⊔

Theorem 5. (V, ≤) is a total order.

Proof. We prove the properties of a total order. Let x, y, z be vertices in V.
– Connexity: x ≤ y or y ≤ x. This follows from the connexity of the total order ⪯.
– Transitivity: x ≤ y and y ≤ z imply x ≤ z. (i) Suppose x ⪯_N y. (i-1) If y ⪯_N z, then by transitivity of ⪯_N, x ⪯_N z. (i-2) Otherwise, z ⋠_N y and y ⪯ z. It cannot be that z ⪯_N x, because transitivity of ⪯_N would imply z ⪯_N y, a contradiction. Furthermore, it cannot be that z ≺ x, because y ⪯ z ≺ x and x ⪯_N y imply y ∈ ω(z) by the definition of HTO, that is, z ⪯_N y, again a contradiction. By connexity of ⪯, x ⪯ z; hence x ≤ z. (ii) Otherwise, y ⋠_N x and x ⪯ y. (ii-1) If y ⪯_N z, then z ⋠_N x, because otherwise transitivity of ⪯_N would imply y ⪯_N x. By connexity of ⪯, either x ⪯ z or z ≺ x. If x ⪯ z, then x ≤ z. If z ≺ x, then z ≺ x ⪯ y and y ⪯_N z imply, by the definition of HTO, z ∈ ω(x), that is, x ⪯_N z; hence x ≤ z. (ii-2) Otherwise, z ⋠_N y and y ⪯ z, so x ⪯ z by transitivity of ⪯. Moreover, z ⪯_N x cannot hold: it would imply either x = z, whence x = y by anti-symmetry of ⪯, or x ∈ ω(z), whence x ⪯ y ⪯ z places y in C(x) by the definition of HTO; either way, y ⪯_N x, a contradiction. Hence x ≤ z.
– Anti-symmetry: x ≤ y and y ≤ x imply x = y. (i) If x ⪯_N y, then y ⪯_N x must hold for y ≤ x to be true. By anti-symmetry of ⪯_N, x = y. (ii) Otherwise, y ⋠_N x and x ⪯ y. For y ≤ x to be true, x ⋠_N y and y ⪯ x. By anti-symmetry of ⪯, x = y. ⊓⊔

Theorem 6. For u, v ∈ V, if Inst[v] reads Post[u], then u ≤ v.

Proof. By the definition of the mapping Inst, there must exist v′ ∈ V such that u → v′ and v′ ⪯_N v for Inst[v] to read Post[u]. By the definition of WTO, either u ≺ v′ and v′ ∉ ω(u), or v′ ⪯ u and v′ ∈ ω(u). In both cases, u ≤ v′. Because v′ ⪯_N v, and hence v′ ≤ v, we have u ≤ v. ⊓⊔

A.2 Optimality of M_opt in § 3

This section presents the theorems and proofs about the optimality of M_opt described in § 3. The theorem is divided into optimality theorems for the maps that constitute M_opt.

Given M = (Dpost, Achk, Dpostℓ, Dpreℓ) and a map Dpost′, we write M ◁ Dpost′ to denote the memory configuration (Dpost′, Achk, Dpostℓ, Dpreℓ). Similarly, M ◁ Achk′ denotes (Dpost, Achk′, Dpostℓ, Dpreℓ), and so on. For a given FM program P, a map X that constitutes a memory configuration is valid for P iff M ◁ X is valid for every valid memory configuration M. Also, X is optimal for P iff M ◁ X is optimal for an optimal memory configuration M.

Theorem 7.
Dpost_opt is valid. That is, given an FM program P and a valid memory configuration M, ⟦P⟧_{M ◁ Dpost_opt} = ⟦P⟧_M.

Proof. Our approach does not change the iteration order; it only changes where the deallocations are performed. Therefore, it suffices to show that for every edge u → v, Post[u] is available whenever Inst[v] is executed.

Suppose not: there exists an edge u → v that violates this. Let d be Dpost_opt[u] as computed by our approach. Then the execution trace of P executes Inst[v] after the deallocation of Post[u] in Inst[d], with no execution of Inst[u] in between.

Because ≤ is a total order, either d < v or v ≤ d. It must be v ≤ d, because d < v implies d < v ≤ Lift(u, v), which contradicts the definition of Dpost_opt[u]. Then, by the definition of ≤, either v ⪯_N d or (d ⋠_N v) ∧ (v ⪯ d). In both cases, the only way Inst[v] can be executed after Inst[d] is for there to be another head h whose repeat instruction includes both Inst[d] and Inst[v], that is, d ≺_N h and v ≺_N h. By the definition of WTO and u → v, either u ≺ v or u ⪯_N v. It must be u ≺ v, because if u ⪯_N v, then Inst[u] is part of Inst[v], so Inst[u] is executed before Post[u] is read in Inst[v]. Furthermore, it must be u ≺ h, because if h ⪯ u, then Inst[u] is executed before Inst[v] in each iteration over C(h). However, that implies h ∈ (⌊⌊v⌉⌉_⪯N \ ⌊⌊u⌉⌉_⪯N), which, combined with d ≺_N h, contradicts the definition of Dpost_opt[u]. Therefore, no such edge u → v can exist, and the theorem holds. ⊓⊔

Theorem 8.
Dpost_opt is optimal. That is, given an FM program P, the memory footprint of ⟦P⟧_{M ◁ Dpost_opt} is smaller than or equal to that of ⟦P⟧_M for every valid memory configuration M.

Proof. For Dpost_opt to be optimal, the deallocations of Post values must be placed at the earliest positions possible for a valid memory configuration M ◁ Dpost_opt. That is, there must not exist u, b ∈ V such that, with d = Dpost_opt[u], we have b ≠ d, M ◁ (Dpost_opt[u ← b]) is valid, and Inst[b] deletes Post[u] earlier than Inst[d] does.

Suppose not: such u, b exist. Let d be Dpost_opt[u] as computed by our approach. Then it must be b < d for Inst[b] to delete Post[u] earlier than Inst[d]. Also, for every edge u → v, it must be v ≤ b for Inst[v] to be executed before Post[u] is deleted in Inst[b].

By the definition of Dpost_opt, v ≤ d for all u → v. Also, by Theorem 6, u ≤ v. Hence u ≤ d, so either u ⪯_N d or (d ⋠_N u) ∧ (u ⪯ d). If u ⪯_N d, then by the definition of Lift, it must be that u → d. Therefore d ≤ b, which contradicts b < d. Alternatively, if (d ⋠_N u) ∧ (u ⪯ d), there must exist v ∈ V such that u → v and Lift(u, v) = d. To satisfy v ≤ b, v ⪯_N d, and b < d, it must be that b ⪯_N d. However, this makes the analysis incorrect: when the stabilization check fails for C(d), Inst[v] is executed again, attempting to read Post[u], which Inst[b] has already deleted. Therefore, no such u, b can exist, and the theorem holds. ⊓⊔

Theorem 9.
Achk_opt is valid. That is, given an FM program P and a valid memory configuration M, ⟦P⟧_{M ◁ Achk_opt} = ⟦P⟧_M.

Proof. Let v = Achk_opt[u]. If v is a head, then by the definition of Achk_opt, C(v) is the largest component that contains u. Therefore, once C(v) has stabilized, Inst[u] can no longer be executed, and Pre[u] remains the same. If v is not a head, then v = u; that is, no component contains u. Therefore, Pre[u] remains the same after the execution of Inst[u]. In both cases, the values passed to Ck_u are the same as when using Achk_dflt. ⊓⊔

Theorem 10.
Achk_opt is optimal. That is, given an FM program P, the memory footprint of ⟦P⟧_{M ◁ Achk_opt} is smaller than or equal to that of ⟦P⟧_M for every valid memory configuration M.

Proof. Because a Pre value is deleted right after its corresponding assertions are checked, it suffices to show that Achk_opt places the assertion checks at the earliest possible positions.

Let v = Achk_opt[u]. By the definition of Achk_opt, u ⪯_N v. For some b to perform the assertion checks of u earlier than v, it must satisfy b ≺_N v. However, because one cannot know in advance when a component of v will stabilize and when Pre[u] will converge, the assertion checks of u cannot be performed in Inst[b]. Therefore, our approach puts the assertion checks at the earliest positions, which leads to the minimum memory footprint. ⊓⊔

Theorem 11.
Dpostℓ_opt is valid. That is, given an FM program P and a valid memory configuration M, ⟦P⟧_{M ◁ Dpostℓ_opt} = ⟦P⟧_M.

Proof. Again, our approach does not change the iteration order; it only changes where the deallocations are performed. Therefore, it suffices to show that for every edge u → v, Post[u] is available whenever Inst[v] is executed.

Suppose not: there exists an edge u → v that violates this. Let d′ be the element of Dpostℓ_opt[u] that causes the violation. Then the execution trace of P executes Inst[v] after the deallocation of Post[u] in Inst[d′], with no execution of Inst[u] in between. Because Post[u] is deleted inside the loop of Inst[d′], Inst[v] must be nested in Inst[d′] or be executed after Inst[d′] to be affected. That is, either v ⪯_N d′ or d′ ≺ v. Also, by the way Dpostℓ_opt[u] is computed, u ⪯_N d′.

First consider the case v ⪯_N d′. By the definition of WTO and u → v, either u ≺ v or u ⪯_N v. In either case, Inst[u] is executed before Inst[v] reads Post[u]. Therefore, the deallocation of Post[u] in Inst[d′] cannot cause the violation.

Alternatively, consider d′ ≺ v and v ⋠_N d′. Because u ⪯_N d′, Post[u] is generated in each iteration over C(d′), and the last iteration does not delete Post[u]. Therefore, Post[u] is available when Inst[v] is executed. Hence, no such u, d′ exist, and the theorem holds. ⊓⊔

Theorem 12.
Dpostℓ_opt is optimal. That is, given an FM program P, the memory footprint of ⟦P⟧_{M ◁ Dpostℓ_opt} is smaller than or equal to that of ⟦P⟧_M for every valid memory configuration M.

Proof. Because one cannot know in advance when a component will stabilize, the decision to delete an intermediate Post[u] cannot be made earlier than the stabilization check of a component that contains u. Our approach makes such decisions in all relevant components that contain u.

If u ⪯_N d, then Dpostℓ_opt[u] = ⌊⌊u⌉⌉_⪯N ∩ ⌊⌊d⌉⌉_⪯N. Because Post[u] is deleted in Inst[d], we do not have to consider the components in ⌊⌊d⌉⌉_⪯N \ {d}. Alternatively, if u ⋠_N d, then Dpostℓ_opt[u] = ⌊⌊u⌉⌉_⪯N \ ⌊⌊d⌉⌉_⪯N. Because Post[u] is deleted in Inst[d], we do not have to consider the components outside this set. Therefore, Dpostℓ_opt is optimal. ⊓⊔
Theorem 13. Dpreℓ_opt is valid. That is, given an FM program P and a valid memory configuration M, ⟦P⟧_{M ◁ Dpreℓ_opt} = ⟦P⟧_M.

Proof. Pre[u] is used only in the assertion checks and to perform widening in Inst[u]. Because u is removed from Dpreℓ[u], the deletion does not affect widening.

For all v ∈ Dpreℓ[u], v ⪯_N Achk_opt[u]. Because Pre[u] is not deleted when C(v) stabilizes, Pre[u] is available when the assertion checks are performed in Inst[Achk_opt[u]]. Therefore, Dpreℓ_opt is valid. ⊓⊔
Theorem 14. Dpreℓ_opt is optimal. That is, given an FM program P, the memory footprint of ⟦P⟧_{M ◁ Dpreℓ_opt} is smaller than or equal to that of ⟦P⟧_M for every valid memory configuration M.

Proof. Because one cannot know in advance when a component will stabilize, the decision to delete an intermediate Pre[u] cannot be made earlier than the stabilization check of a component that contains u. Our approach makes such decisions in all components that contain u. Therefore, Dpreℓ_opt is optimal. ⊓⊔
Theorem 1. The memory configuration M_opt = (Dpost_opt, Achk_opt, Dpostℓ_opt, Dpreℓ_opt) is optimal.

Proof. This follows from Theorems 7 to 14. ⊓⊔
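The nesting forest ⪯_N and the total order ≤ that these proofs rely on can be illustrated concretely. The sketch below is ours, not the paper's: the vertex set, the nested components, and all helper names are invented for the example. It derives ω(·), ⪯_N, and ≤ from a hypothetical WTO 1 2 (3 4 (5 6) 7) 8 and brute-force checks the forest and total-order properties claimed in Theorems 4 and 5.

```python
from itertools import product

# Vertices in a hypothetical WTO 1 2 (3 4 (5 6) 7) 8; ⪯ is the numeric order.
V = [1, 2, 3, 4, 5, 6, 7, 8]
# Components, keyed by their heads (each head belongs to its own component).
C = {3: {3, 4, 5, 6, 7}, 5: {5, 6}}

def omega(x):
    """Heads of the components that contain x, excluding x itself."""
    return {h for h, members in C.items() if x in members and h != x}

def le_N(x, y):
    """Nesting order: x ⪯_N y iff x = y or y ∈ ω(x)."""
    return x == y or y in omega(x)

def le(x, y):
    """Derived order: x ≤ y iff x ⪯_N y, or (y ⋠_N x and x ⪯ y)."""
    return le_N(x, y) or (not le_N(y, x) and x <= y)

# Theorem 4: (V, ⪯_N) is a forest -- each vertex's ancestors form a chain.
for v in V:
    anc = [y for y in V if le_N(v, y)]
    assert all(le_N(a, b) or le_N(b, a) for a, b in product(anc, anc))

# Theorem 5: (V, ≤) is a total order.
for x, y in product(V, V):
    assert le(x, y) or le(y, x)                    # connexity
    assert not (le(x, y) and le(y, x)) or x == y   # anti-symmetry
for x, y, z in product(V, V, V):
    assert not (le(x, y) and le(y, z)) or le(x, z) # transitivity
```

Note how ≤ orders a head after the body of its component (for example, 4 ≤ 3 here, since 3 heads the component containing 4), which is what makes Theorem 6 hold for back edges.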
A.3 Correctness and efficiency of GenerateFMProgram in § 4

This section presents the theorems and proofs about the correctness and efficiency of GenerateFMProgram (Algorithm 1, § 4).

Theorem 2. GenerateFMProgram correctly computes M_opt, defined in § 3.

Proof. We show that each map is constructed correctly.
– Dpost_opt: Let v′ be the value of Dpost_opt[u] before it is overwritten in Line 50, 37, or 41. Descending post-DFN ordering corresponds to a topological sorting of the nested SCCs. Therefore, in Lines 50 and 37, v′ ≺ v. Also, because v ⪯_N h for all v ∈ N_h in Line 41, v′ ⪯_N v. In either case, v′ ≤ v. Because rep(v) essentially performs Lift(u, v) when restoring the edges, the final Dpost_opt[u] is the maximum of the lifted successors, and the map is correctly computed.
– Dpostℓ_opt: The correctness follows from the correctness of T. Because the components are constructed bottom-up, rep(u) in Lines 51 and 38 returns max_⪯N(⌊⌊u⌉⌉_⪯N \ ⌊⌊Dpost_opt[u]⌉⌉_⪯N). Also, N* = ⪯_N. Thus, Dpostℓ_opt is correctly computed.
– Achk_opt: At the end of the algorithm, rep(v) is the head of the maximal component that contains v, or v itself when v is outside of any component. Therefore, Achk_opt is correctly computed.
– Dpreℓ_opt: By the same reasoning as for Achk_opt, and because N* = ⪯_N, Dpreℓ_opt is correctly computed. ⊓⊔
Theorem 3. The running time of GenerateFMProgram is almost-linear.

Proof. The base WTO-construction algorithm runs in almost-linear time [27]. The starred lines in Algorithm 1 visit each edge and vertex once. Therefore, the time complexity remains almost-linear. ⊓⊔
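The almost-linear bound rests on the classic behavior of union-find with path compression (Tarjan [47]): rep(·) queries are disjoint-set finds, and forming a component collapses its members into the head's set. A minimal sketch of that mechanism follows; the class and method names are ours for illustration, and this is not the IKOS/Mikos implementation of Algorithm 1.

```python
class DisjointSet:
    """Union-find with path compression, the structure behind almost-linear
    rep(.) queries in WTO construction."""

    def __init__(self, vertices):
        # Initially every vertex is its own representative.
        self.parent = {v: v for v in vertices}

    def rep(self, v):
        """Find the representative of v, compressing the path along the way."""
        root = v
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every visited vertex directly at the root.
        while self.parent[v] != root:
            self.parent[v], v = root, self.parent[v]
        return root

    def merge_into(self, head, v):
        """Collapse v's set into head's, as when a component C(head) is formed."""
        self.parent[self.rep(v)] = self.rep(head)

# Hypothetical components mirroring the WTO 1 2 (3 4 (5 6) 7) 8:
ds = DisjointSet(range(1, 9))
ds.merge_into(5, 6)            # inner component C(5) = {5, 6}
for v in (4, 5, 7):            # outer component C(3) = {3, 4, 5, 6, 7}
    ds.merge_into(3, v)
assert ds.rep(6) == 3 and ds.rep(2) == 2
```

With path compression (plus union by rank, omitted here for brevity), a sequence of m operations on n vertices costs O(m α(n)), which is the "almost-linear" bound cited above.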
B Further experimental evaluation
Table 2: Measurements for the benchmarks that took less than 5 seconds. Time diff is the runtime of IKOS minus that of Mikos (positive means a speedup in Mikos). Memory diff is the memory footprint of IKOS minus that of Mikos (positive means a memory reduction in Mikos).

          Time (s)           Memory (MB)       Time diff (s)         Memory diff (MB)
Task  min.  max.  avg.    min.  max.  avg.    min.   max.   avg.    min.   max.  avg.
T1    0.11  4.98  0.58      25   564    42   -0.61  +1.44  +0.08   -0.37   +490   +12
T2    0.06  4.98  1.07       9   218    46   -0.05  +1.33  +0.14   -0.43   +172   +18

Table 3:
Task T1. A sample of the results for task T1 in Figure 3(a), excluding the benchmarks that did not complete in IKOS. The first 5 rows list the benchmarks with the smallest memory reduction ratios (MRRs). The latter 5 rows list the benchmarks with the largest memory footprints. The smaller the MRR, the greater the reduction in memory footprint. T: time; MF: memory footprint.

                                       IKOS               Mikos
Benchmark                          T (s)  MF (MB)    T (s)  MF (MB)    MRR
3.16-rc1/205_9a-net-rtl8187         1500    45905     1314       56  0.001
4.2-rc1/43_2a-mmc-rtsx             786.5    26909    594.8       42  0.002
4.2-rc1/43_2a-video-radeonfb        2494    56752     1930      107  0.002
4.2-rc1/43_2a-net-skge              3523    47392     3131       98  0.002
4.2-rc1/43_2a-usb-hcd              220.4    17835    150.8       39  0.002
4.2-rc1/32_7a-target_core_mod       1316    60417     1110     2967  0.049
challenges/3.14-alloc-libertas      2094    60398     1620      626  0.010
4.2-rc1/43_2a-net-libertas          1634    59902     1307      307  0.005
challenges/3.14-kernel-libertas     2059    59826     1688     2713  0.045
3.16-rc1/43_2a-sound-cs46xx         3101    58087     2498      193  0.003
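As the table values indicate, the memory reduction ratio (MRR) is Mikos's memory footprint divided by IKOS's, and speedup is IKOS's runtime divided by Mikos's. A quick arithmetic check on values copied from the tables (the helper names are ours):

```python
def mrr(ikos_mf_mb, mikos_mf_mb):
    """Memory reduction ratio: Mikos footprint / IKOS footprint (smaller is better)."""
    return mikos_mf_mb / ikos_mf_mb

def speedup(ikos_t_s, mikos_t_s):
    """Speedup: IKOS runtime / Mikos runtime (larger is better)."""
    return ikos_t_s / mikos_t_s

# 3.16-rc1/205_9a-net-rtl8187 (Table 3): 45905 MB down to 56 MB.
assert round(mrr(45905, 56), 3) == 0.001
# openssh-8.0p1/sftp (Tables 5 and 6): 45903 MB down to 9137 MB, 3036 s vs. 3446 s.
assert round(mrr(45903, 9137), 3) == 0.199
assert round(speedup(3036, 3446), 2) == 0.88
```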
Table 4: Task T1. A sample of the results for task T1 in Figure 3(b). The first 3 rows list the benchmarks with the lowest speedups. The latter 3 rows list the benchmarks with the highest speedups. T: time; MF: memory footprint.

                               IKOS               Mikos
Benchmark                  T (s)  MF (MB)    T (s)  MF (MB)    MRR  Speedup
challenges/3.8-usb-main11  42.63      541    48.92      122  0.225    0.87×
challenges/3.8-usb-main0   54.31     3025    61.78      190  0.063    0.88×
challenges/3.8-usb-main1   42.84      457    47.73      119  0.261    0.90×
…

Table 5:
Task T2. A sample of the results for task T2 in Figure 5(a), excluding the benchmarks that did not complete in IKOS. The first 5 rows list the benchmarks with the smallest memory reduction ratios (MRRs). The latter 5 rows list the benchmarks with the largest memory footprints. The smaller the MRR, the greater the reduction in memory footprint. T: time; MF: memory footprint.

                                    IKOS               Mikos
Benchmark                       T (s)  MF (MB)    T (s)  MF (MB)    MRR
lxsession-0.5.4/lxsession       146.1     5831    81.57      130  0.022
rox-2.11/ROX-Filer              362.3     9569    400.6      329  0.034
tor-0.3.5.8/tor-resolve         58.36     1930    53.10       70  0.036
openssh-8.0p1/ssh-keygen         1212    29670     1170     1128  0.038
xsane-0.999/xsane               499.8    10118    467.5      430  0.042
openssh-8.0p1/sftp               3036    45903     3446     9137  0.199
metacity-3.30.1/metacity         2111    36324     2363     6329  0.174
links-2.19/links                 2512    29761     2740     3930  0.132
openssh-8.0p1/ssh-keygen         1212    29670     1170     1128  0.038
links-2.19/xlinks                2523    29587     2760     3921  0.133
Table 6: Task T2. A sample of the results for task T2 in Figure 5(b). The first 3 rows list the benchmarks with the lowest speedups. The latter 3 rows list the benchmarks with the highest speedups. T: time; MF: memory footprint.

                                      IKOS               Mikos
Benchmark                         T (s)  MF (MB)    T (s)  MF (MB)    MRR  Speedup
moserial-3.0.12/moserial          422.3      109    585.5      107  0.980    0.72×
openssh-8.0p1/ssh-pkcs11-helper   82.70      674    94.61      613  0.910    0.87×
openssh-8.0p1/sftp                 3036    45903     3446     9137  0.199    0.88×
packeth-1.9/packETH               188.7      153    83.82      120  0.782    2.25×
lxsession-0.5.4/lxsession         146.1     5831    81.57      130  0.022    1.79×
xscreensaver-5.42/braid            6.48      203     4.87       36  0.179    1.33×