Inductive Synthesis for Probabilistic Programs Reaches New Horizons
Roman Andriushchenko, Milan Češka, Sebastian Junges, Joost-Pieter Katoen
TACAS artifact evaluation (AEC): consistent, complete, well documented, easy to reuse, evaluated.
Brno University of Technology, Brno, Czech Republic, [email protected]
University of California, Berkeley, USA
RWTH Aachen University, Aachen, Germany
Abstract.
This paper presents a novel method for the automated synthesis of probabilistic programs. The starting point is a program sketch representing a finite family of finite-state Markov chains with related but distinct topologies, and a reachability specification. The method builds on a novel inductive oracle that greedily generates counter-examples (CEs) for violating programs and uses them to prune the family. These CEs leverage the semantics of the family in the form of bounds on its best- and worst-case behaviour provided by a deductive oracle using an MDP abstraction. The method further monitors the performance of the synthesis and adaptively switches between inductive and deductive reasoning. Our experiments demonstrate that the novel CE construction provides a significantly faster and more effective pruning strategy leading to an accelerated synthesis process on a wide range of benchmarks. For challenging problems, such as the synthesis of decentralized partially-observable controllers, we reduce the run-time from a day to minutes.
Background and motivation.
Controller synthesis for Markov decision processes (MDPs [35]) and temporal logic constraints is a well-understood and tractable problem, with a plethora of mature tools providing efficient solving capabilities. However, the applicability of these controllers to a variety of systems is limited: systems may be decentralized, controllers may not be able to observe the complete system state, cost constraints may apply, and so forth. Adequate operational models for these systems exist in the form of decentralized partially-observable MDPs (DEC-POMDPs [33]). The controller synthesis problem for these models is undecidable [30], and tool support (for verification tasks) is scarce.

This paper takes a different approach: the controller together with the environment can be modelled as probabilistic program sketches where “holes” in the probabilistic program model choices that the controller may make. Conceptually, the controllers of the DEC-POMDP are described by a user-defined finite family $\mathcal{M}$ of Markov chains. The synthesis problem that we consider is to find a Markov chain $M$ (i.e., a probabilistic program) in the family $\mathcal{M}$ such that $M \models \varphi$, where $\varphi$ is the specification. To allow efficient algorithms, the family must have some structure. In particular, in our setting, the family is parameterized by a set of discrete parameters $K$; an assignment $K \to V$ of these parameters with concrete values $V$ from their associated domains yields a family member, i.e., a Markov chain (MC). Such a parameterization is naturally obtained from the probabilistic program sketch, where some constants (or program parts) can be left open. The search for a family member can thus be considered as the search for a hole-assignment. This approach fits within the realm of syntax-guided synthesis [2].

(*) This work has been partially supported by the Czech Science Foundation grant GJ20-02328Y and the ERC AdG Grant 787914 FRAPPANT, the NSF grants 1545126 (VeHICaL) and 1646208, by the DARPA Assured Autonomy program, by Berkeley Deep Drive, and by Toyota under the iCyPhy center. Preprint: arXiv [cs.LO].

Motivating example.
Herman’s protocol [24] is a well-studied randomized distributed algorithm that aims at fast stabilization on average. In [26], a family $\mathcal{M}$ of MCs is used to model different protocol instances. The authors considered each instance separately and determined which of the controllers for Herman’s protocol performs best. Let us consider the protocol in a bit more detail: it considers self-stabilization of a unidirectional ring of network stations where all stations have to behave similarly (an anonymous network). Each station stores a single bit, and can read the internal bit of one (say left) neighbour. To achieve stabilization, a station for which the two legible bits coincide updates its own bit based on the outcome of a coin flip.
The challenge is to select a controller that flips this coin with an optimal bias, i.e., minimizing the expected time until stabilization. In a setting where the probabilities range over 0.1, 0.2, ..., 0.9, this results in analyzing nine different MCs. Does the expected time until stabilization reduce if the controllers are additionally allowed to have a single bit of memory? In every step, there are 9 · · · , 368 models. Eventually, analyzing all individual MCs is infeasible.
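To make the motivating example concrete, the following is a minimal simulation sketch of the protocol as described above. It is not the model used in [26]: the ring size, the starting configuration, the rule that a station without coinciding bits copies its neighbour's bit (taken from the standard formulation of Herman's protocol), and the reading of the bias as the probability of choosing bit 1 are all assumptions made for illustration. The function names are hypothetical.

```python
import random

def tokens(bits):
    """Indices of stations whose bit equals the bit of their left neighbour."""
    return [i for i in range(len(bits)) if bits[i] == bits[i - 1]]

def step(bits, bias, rng):
    """One synchronous round on the ring (index -1 wraps to the last station)."""
    new_bits = []
    for i, own in enumerate(bits):
        left = bits[i - 1]
        if own == left:
            # The two legible bits coincide: flip the biased coin.
            # Assumption: the new bit is 1 with probability `bias`.
            new_bits.append(1 if rng.random() < bias else 0)
        else:
            # Assumption (standard formulation): copy the neighbour's bit.
            new_bits.append(left)
    return new_bits

def stabilization_time(n, bias, rng, max_steps=100_000):
    """Rounds until exactly one station holds a token, from all-equal bits."""
    bits = [0] * n
    for t in range(max_steps):
        if len(tokens(bits)) == 1:
            return t
        bits = step(bits, bias, rng)
    raise RuntimeError("no stabilization within max_steps")

def mean_time(n, bias, runs=2000, seed=0):
    """Monte-Carlo estimate of the expected stabilization time."""
    rng = random.Random(seed)
    return sum(stabilization_time(n, bias, rng) for _ in range(runs)) / runs

if __name__ == "__main__":
    # One estimate per candidate bias, i.e., per member of the nine-MC family.
    for b in [k / 10 for k in range(1, 10)]:
        print(f"bias {b:.1f}: mean stabilization time {mean_time(5, b):.2f}")
```

Such a simulation only estimates the expected time for one family member at a time, which is exactly the one-instance-at-a-time analysis that becomes infeasible once memory is added.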
Oracle-guided synthesis.
To tackle the synthesis problem, we introduce an oracle-guided inductive synthesis approach [25,39]. A learner selects a family member and passes it to the oracle. The oracle answers whether the family member satisfies $\varphi$ and, crucially, gives additional information if it does not. Inspired by [9], if the family member violates the specification $\varphi$, our oracle returns a set $K'$ of parameters such that all family members obtained by changing only the values assigned to $K'$ violate $\varphi$. We argue that such an oracle must (1) induce little overhead in providing $K'$, (2) be aware of the existence of parameters in the family, and (3) have (a resemblance of) awareness about the semantics of the parameters and their values.
Oracles.
With these requirements in mind, we construct a counterexample (CE)-based oracle from scratch. We do so by carefully exploiting existing methods. We construct critical subsystems as CEs [1]. Critical subsystems are parts of the MC that suffice to refute the specification. If a hole is absent in a CE, its value is irrelevant. To avoid the cost of finding optimal CEs (an NP-hard problem [19]), we consider greedy CEs that are similar to [9]. However, our greedy CEs are aware of the parameters, and try to limit the occurrence of parameters in the CE. Finally, to provide awareness of the semantics of parameter values, we provide lower and upper bounds on all states: their difference indicates how much varying the value at a hole may change the overall reachability probability. These bounds are efficiently computed by another oracle. This oracle analyses a quotient MDP obtained by employing an abstraction method that is part of the abstraction-refinement loop in [10].
A hybrid variant.
The two oracles are significantly different. Abstraction refinement is deductive: it argues about single family members by considering (an aggregation of) all family members. The critical subsystem oracle is inductive: by examining a single family member, it infers statements about other family members. This suggests a middle ground: a hybrid strategy monitors the performance of the two oracles during the synthesis and suggests their best usage. More precisely, the hybrid strategy integrates the counterexample-based oracle into the abstraction-refinement loop.
Major results.
We present a novel and dedicated oracle deployed in an efficacious synthesis loop. We use model-checking results on an abstraction to tailor smaller CEs. Our greedy and family-aware CE construction is substantially faster than the use of optimal CEs. Together, these two improvements yield CEs that are on par with optimal CEs, but are found much faster. The integration of multiple abstraction-refinement steps yields superior performance. We compare our performance with the abstraction-refinement loop from [10] using benchmarks from [10]. Benchmarks can be classified along two dimensions: (A) benchmarks with a structure good for CE generation, and (B) benchmarks with a structure good for abstraction-refinement. A-benchmarks are a natural strength of our novel oracle. Our simple, efficient hybrid strategy significantly outperforms the state-of-the-art on A-benchmarks, while it only yields limited overhead for B-benchmarks. Most importantly, the novel hybrid strategy can solve benchmarks that are out of reach for pure abstraction-refinement or pure CE-based reasoning. In particular, our hybrid method is able to synthesize the optimal Herman protocol with memory: the synthesis time on a design space with 3.1 million candidate programs reduces from a day to minutes.
Related work
The synthesis problems for parametric probabilistic systems can be divided into the following two categories.
Topology synthesis, akin to the problem considered in this paper, assumes a finite set of parameters affecting the MC topology. Finding an instantiation satisfying a reachability property is NP-complete in the number of parameters [12], and can naively be solved by analyzing all individual family members. An alternative is to model the MC family by an MDP and resort to standard MDP model-checking algorithms. Tools such as ProFeat [13] or QFLan [40] take this approach to quantitatively analyze alternative designs of software product lines [21,28]. These methods are limited to small families. This motivated (1) abstraction-refinement over the MDP representation [10], and (2) counterexample-guided inductive synthesis (CEGIS) for MCs [9], mentioned earlier. The alternative problem of sketching for probabilistic programs that fit given data is studied, e.g., in [32,38].
Parameter synthesis considers models with uncertain parameters associated to transition probabilities, and analyses how the system behaviour depends on the parameter values. The most promising techniques are based on parameter lifting that treats identical parameters in different transitions independently [8,36] and has been implemented in the state-of-the-art probabilistic model checkers Storm [18] and PRISM [27]. An alternative approach based on building rational functions for the satisfaction probability has been proposed in [15] and further improved in [22,17,4]. This approach has also been applied to different problems such as model repair [5,34,11].
Both synthesis problems can also be attacked by search-based techniques that do not ensure an exhaustive exploration of the parameter space. These include evolutionary techniques [23,31] and genetic algorithms [20]. Combinations with parameter synthesis have been used [7] to synthesize robust systems.
We formalize the essential ingredients and the problem statement. See [3] for more material.
Sets of Markov chains. A (discrete) distribution over a finite set $X$ is a function $\mu : X \to [0, 1]$ s.t. $\sum_{x \in X} \mu(x) = 1$. The set $Distr(X)$ contains all distributions over $X$. The support of $\mu \in Distr(X)$ is $\mathrm{supp}(\mu) = \{x \in X \mid \mu(x) > 0\}$.

Definition 1 (MC). A Markov chain (MC) is a tuple $D = (S, s_0, P)$, where $S$ is a finite set of states, $s_0 \in S$ is an initial state, and $P : S \to Distr(S)$ is a transition probability function. We write $P(s, t)$ to denote $P(s)(t)$. The state $s$ is absorbing if $P(s, s) = 1$.

Let $K$ denote a finite set of discrete parameters, where each parameter $k \in K$ has a finite domain $V_k$. For brevity, we often assume that all domains are the same, and omit the subscript $k$. A realization $r$ maps parameters to values in their domain, i.e., $r : K \to V$. Let $\mathcal{R}^{\mathcal{D}}$ denote the set of all realizations of a set $\mathcal{D}$ of MCs. A $K$-parameterized set of MCs $\mathcal{D}(K)$ contains the MCs $D_r$, for every $r \in \mathcal{R}^{\mathcal{D}}$. In Sect. 3, we give an operational model for such sets. In particular, realizations will fix the targets of transitions. In our experiments, we describe these sets using the PRISM modelling language where parameters are described by undefined integer values.
Properties and specifications.
For simplicity, we consider (unbounded) reachability properties. (Our implementation also supports expected reachability rewards.) For a set $T \subseteq S$ of target states, let $P[D, s \models \Diamond T]$ denote the probability in MC $D$ to eventually reach some state in $T$ when starting in the state $s \in S$. A property $\varphi \equiv P_{\bowtie\lambda}[\Diamond T]$ with $\lambda \in [0, 1]$ and $\bowtie\, \in \{\le, \ge\}$ expresses that the probability to reach $T$ relates to $\lambda$ according to $\bowtie$. If $\bowtie$ is $\le$, then $\varphi$ is a safety property; otherwise, it is a liveness property. Formally, state $s$ in MC $D$ satisfies $\varphi$ if $P[D, s \models \Diamond T] \bowtie \lambda$. The MC $D$ satisfies $\varphi$ if the above holds for its initial state. A specification is a set of properties $\Phi = \{\varphi_i\}_{i \in I}$, and $D \models \Phi$ if $\forall i \in I : D \models \varphi_i$.
Problem statement.
The key problem statement in this paper is feasibility: given a parameterized set of Markov chains $\mathcal{D}(K)$ over parameters $K$ and a specification $\Phi$, find a realization $r : K \to V$ such that $D_r \models \Phi$. When $\mathcal{D}$ is clear from the context, we often write $r \models \Phi$ to denote $D_r \models \Phi$.

We additionally consider the optimizing variant of the synthesis problem. The maximal synthesis problem asks: given a maximizing property $\varphi_{\max} \equiv P_{\bowtie\lambda}[\Diamond T]$, identify $r^* \in \arg\max_{r \in \mathcal{R}^{\mathcal{D}}} \{ P[D_r \models \Diamond T] \mid D_r \models \Phi \}$, provided it exists. The minimal synthesis problem is defined analogously.

As the state space $S$, the set $K$ of parameters, and their domains are all finite, the above synthesis problems are decidable. One possible solution, called the one-by-one approach [14], considers each realization $r \in \mathcal{R}^{\mathcal{D}}$ separately. The state-space and parameter-space explosion renders this approach unusable for large problems, necessitating advanced techniques that exploit the family structure.

In this section, we recap a baseline for a counterexample-guided inductive synthesis (CEGIS) loop, as put forward in [9]. In particular, we first instantiate an oracle-guided synthesis method, then discuss an operational model for families that gives structure to the parameterized set of Markov chains, and finally detail the usage of CEs to create an oracle.
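Fig. 1. Oracle-guided synthesis. (The learner receives $\mathcal{R}^{\mathcal{D}}$ and $\Phi$ and passes a realization $r \in \mathcal{R}$ to the oracle. The oracle either confirms $r \models \Phi$ or returns a set $\mathcal{R}' \subseteq \mathcal{R}$ with $r \in \mathcal{R}'$ whose members all violate $\Phi$. The learner outputs a realization $r \models \Phi$ or reports that no such realization exists.)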
Consider Fig. 1. A learner takes a set $\mathcal{R}$ of realizations, and has to find a realization $r$ satisfying the specification $\Phi$. The learner maintains (a symbolic representation of) a set $\mathcal{Q} \subseteq \mathcal{R}$ of realizations that need to be checked. It iteratively asks the oracle whether a particular $r \in \mathcal{Q}$ is a solution. If it is a solution, the oracle reports success. Otherwise, the oracle returns a set $\mathcal{R}'$ containing $r$ and potentially more realizations, all violating $\Phi$. The learner then prunes $\mathcal{R}'$ from $\mathcal{Q}$. In Section 4, we focus on creating an efficient oracle that computes a set $\mathcal{R}'$ (with $r \in \mathcal{R}'$) of realizations that are all violating $\Phi$. In Section 5, we provide a more advanced framework that extends this method. The remainder of this section lays the groundwork for these sections.
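The learner loop of Fig. 1 can be sketched as follows. This is an illustration under simplifying assumptions, not the implementation evaluated later: the set of candidates is stored explicitly rather than symbolically, the oracle describes the violating set by a set of parameters whose current values suffice for the violation, and `toy_oracle` with its hard-coded specification is entirely hypothetical.

```python
from itertools import product

def synthesize(domains, oracle):
    """Learner: enumerate candidates, ask the oracle, prune violating sets.

    domains: dict mapping each parameter to its list of values.
    oracle(r): returns (True, None) if r satisfies the specification, else
    (False, params) where every realization that agrees with r on `params`
    also violates the specification.
    """
    names = sorted(domains)
    queue = [dict(zip(names, vals))
             for vals in product(*(domains[k] for k in names))]
    queries = 0
    while queue:
        r = queue[0]
        queries += 1
        satisfied, params = oracle(r)
        if satisfied:
            return r, queries
        # Prune r and every candidate agreeing with r on the returned parameters.
        queue = [q for q in queue if any(q[k] != r[k] for k in params)]
    return None, queries  # no realization satisfies the specification

def toy_oracle(r):
    """Hypothetical specification: X must be 'b' and Y must be 'd'."""
    if r["X"] != "b":
        return False, {"X"}
    if r["Y"] != "d":
        return False, {"Y"}
    return True, None

DOMAINS = {"X": ["a", "b", "c"], "Y": ["c", "d"], "Z": [0, 1]}
solution, queries = synthesize(DOMAINS, toy_oracle)
print(solution, queries)
```

On this toy family of 12 candidates the learner needs three oracle queries, because each negative answer removes a whole subfamily rather than a single realization.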
Families of Markov chains
To avoid the need to iterate over all realizations, an efficient oracle exploits some structure of the family. In this paper, we focus on sets of Markov chains having different topologies. We explain our concepts using the operational model of families given in [10]. Our implementation supports (more expressive) PRISM programs with undefined integer constants.
Definition 2 (Family of MCs). A family of MCs is a tuple $\mathcal{D} = (S, s_0, K, B)$ with $S$ and $s_0$ as before, $K$ is a finite set of parameters with domains $V_k \subseteq S$ for each $k \in K$, and $B : S \to Distr(K)$ is a family of transition probability functions.

Function $B$ of a family $\mathcal{D}$ of MCs maps each state to a distribution over parameters $K$. In the context of the synthesis of probabilistic models, these parameters represent unknown options or features of a system under design. Realizations are now defined as follows.

Definition 3 (Realization). A realization of a family $\mathcal{D} = (S, s_0, K, B)$ of MCs is a function $r : K \to S$ s.t. $r(k) \in V_k$ for all $k \in K$. We say that realization $r$ induces MC $D_r = (S, s_0, B_r)$ iff $B_r(s, s') = \sum_{k \in K,\, r(k) = s'} B(s)(k)$ for any pair of states $s, s' \in S$. The set of all realizations of $\mathcal{D}$ is denoted as $\mathcal{R}^{\mathcal{D}}$.

The set $\mathcal{R}^{\mathcal{D}} = \prod_{k \in K} V_k$ of all possible realizations has size exponential in $|K|$.
Counterexample-guided oracles
We first consider the feasibility synthesis for a single-property specification and later, cf. Remark 1, generalize this to multiple properties and to optimal synthesis. The notion of counterexamples is at the heart of the oracle from [9] and Sect. 4.
If an MC $D \not\models \varphi$, a counterexample (CE) based on a critical subsystem can serve as diagnostic information about the source of the failure. We consider the following CE, motivated by the notion of critical subsystem in [37].
Definition 4 (Counterexample).
Let $D = (S, s_0, P)$ be an MC with $s_\bot \notin S$. The sub-MC of $D$ induced by $C \subseteq S$ is the MC $D{\downarrow}C = (S \cup \{s_\bot\}, s_0, P')$, where the transition probability function $P'$ is defined by:
$$P'(s) = \begin{cases} P(s) & \text{if } s \in C, \\ [s_\bot \mapsto 1] & \text{otherwise.} \end{cases}$$
The set $C$ and the sub-MC $D{\downarrow}C$ are called a counterexample (CE) for the property $P_{\le\lambda}[\Diamond T]$ on MC $D$, if $D{\downarrow}C \not\models P_{\le\lambda}[\Diamond (T \cap (C \cup \{s_0\}))]$.

Let $D_r$ be an MC violating the specification $\varphi$. To compute other realizations violating $\varphi$, the oracle computes a critical subsystem $D_r{\downarrow}C$, which is then used to deduce a so-called conflict for $D_r$ and $\varphi$.
Definition 5 (Conflict).
For a family of MCs $\mathcal{D} = (S, s_0, K, B)$ and $C \subseteq S$, the set $K_C$ of relevant parameters (called conflict) is given by $\bigcup_{s \in C} \mathrm{supp}(B(s))$.

Fig. 2. Counterexamples for smaller conflicts.
It is straightforward to compute a set of violating realizations from a conflict. A generalization of realization $r$ induced by the set $K_C \subseteq K$ of relevant parameters is the set $r{\uparrow}K_C = \{ r' \in \mathcal{R} \mid \forall k \in K_C : r(k) = r'(k) \}$. We often use the term conflict to refer to its generalization. The size of a conflict, i.e., the number $|K_C|$ of relevant parameters, is crucial. Small conflicts potentially lead to generalizing $r$ to larger subfamilies $r{\uparrow}K_C$. It is thus important that the CEs contain as few parameterized transitions as possible. The size of a CE in terms of the number of states is not of interest. Furthermore, the overhead of providing CEs should be bounded by the payoff: finding a large generalization may take some time, but small generalizations should be returned quickly. The CE-based oracle in [9] uses an off-the-shelf CE procedure [16,41], and mostly does not provide small CEs.

This section develops an oracle based on CEs, tailored for the use in an oracle-guided inductive synthesis loop described in Sect. 3. Its main features are:
– a fast greedy approach to compute CEs that provide small conflicts: we achieve this by taking into account the position of the parameters.
– awareness about the semantics of parameters by using model-checking results from an abstraction of the family.
Before going into details, we provide some illustrative examples.
A motivating example
First, we illustrate what it means to take CEs that lead to small conflicts. Consider Fig. 2, with a family member $D_r$ (left), where the superscript of a state identifier $s_i$ denotes the parameters relevant to $s_i$. Consider the safety property $\varphi \equiv P_{\le\lambda}[\Diamond\{t\}]$ with the threshold $\lambda$ of Fig. 2. Clearly, $D_r \not\models \varphi$, and we can construct two CEs: a three-state set $C_1$ (center) and a four-state set $C_2$ (right), with conflicts $K_{C_1} = \{X, Y\}$ and $K_{C_2} = \{X\}$, respectively. This illustrates that a smaller CE does not necessarily induce a smaller conflict.

We now illustrate awareness of the semantics of parameters. Consider the family $\mathcal{D} = (S, s_0, K', B)$, where $S = \{s_0, s_1, s_2, t, f\}$, the parameters are $K' = \{X, Y, T', F'\}$ with domains $V_X = \{s_1, s_2\}$, $V_Y = \{t, f\}$, $V_{T'} = \{t\}$, $V_{F'} = \{f\}$, and a family $B$ of transition probability functions defined in Fig. 3 (left). As the parameters $T'$ and $F'$ can each take only one value, we consider $K = \{X, Y\}$ as the set of parameters. There are $|V_X| \times |V_Y| = 4$ family members, depicted in Fig. 3 (right). For conciseness, we omit some of the transition probabilities (recall that transition probabilities sum to one). Only one realization satisfies the safety property $\varphi \equiv P_{\le 0.3}[\Diamond\{t\}]$.

Fig. 3. A family $\mathcal{D}$ of four Markov chains (unreachable states are grayed out). Left: $B(s_0) = [X \mapsto 1]$, $B(t) = [T' \mapsto 1]$, $B(f) = [F' \mapsto 1]$, while $B(s_1)$ and $B(s_2)$ are distributions over $\{T', Y, F'\}$. Right: the four family members.

CEGIS [9] illustrated: Consider running CEGIS, and assume the oracle first gets a violating realization $r$. A model checker reveals that $P[D_r, s_0 \models \Diamond T] > 0.3$. The CE for $D_r$ and $\varphi$ contains the (only) path to the target, $s_0 \to r(X) \to t$, whose probability exceeds 0.3. The corresponding CE $C = \{s_0, r(X), t\}$ induces the conflict $K_C = \{X, Y\}$. None of the parameters is generalized. The same argument applies to any subsequent realization: the constructed CEs do not allow for generalization, the oracle returns only the passed realization, and the learner keeps iterating until it happens to guess the satisfying realization.
Can we do better?
To answer this, consider CE generation as a game: the Pruner creates a critical subsystem $C$. The Adversary wins if it finds an MC satisfying $\varphi$ and containing $C$, thus refuting that $C$ is a counterexample. Analogously, off-the-shelf CE generators construct a critical subsystem $C$ such that every possible extension of $C$ is a CE. These are CEs without context. In our setting, we change the game: the Adversary may not extend the MC arbitrarily, but must select a family member. These are CEs modulo a family.

Back to the example: Observe that for a CE for $D_r$, we could omit the states $t$ and $r(X)$ from the set $C$ of critical states: we know for sure that, once $D_r$ takes the transition $(s_0, r(X))$, it will reach target state $t$ with probability at least 0.6. This exceeds the threshold 0.3, regardless of the value of the parameter $Y$. Hence, for family $\mathcal{D}$, the set $C' = \{s_0\}$ is a critical subsystem. The immediate advantage is that this set induces the conflict $K_{C'} = \{X\}$ (parameter $Y$ has been generalized). This enables us to reject all realizations from the set $r{\uparrow}K_{C'}$, which contains $r$ and the realization that differs from $r$ only in the value of $Y$. It is ‘easier’ to construct a CE for a (sub)family than for arbitrary MCs. More generally, a successful oracle needs to have access to useful bounds, and effectively integrate them in the CE generation.
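The reasoning of this example can be replayed by brute force on a small family shaped like the one in Fig. 3. The sketch below is only illustrative: the probabilities in `B` are hypothetical values chosen to be consistent with the statements above (threshold 0.3, a lower bound of 0.6 after the first transition, exactly one satisfying realization), and the bounds are obtained by enumerating all family members, which the actual method avoids.

```python
from itertools import product

STATES = ["s0", "s1", "s2", "t", "f"]
DOMAINS = {"X": ["s1", "s2"], "Y": ["t", "f"], "T'": ["t"], "F'": ["f"]}
# Hypothetical probabilities (the concrete values are an assumption).
B = {
    "s0": {"X": 1.0},
    "s1": {"T'": 0.6, "Y": 0.2, "F'": 0.2},
    "s2": {"T'": 0.2, "Y": 0.2, "F'": 0.6},
    "t": {"T'": 1.0},
    "f": {"F'": 1.0},
}

def induce(r):
    """MC induced by realization r: B_r(s, s') sums B(s)(k) over k with r(k) = s'."""
    P = {s: {} for s in STATES}
    for s, dist in B.items():
        for k, p in dist.items():
            P[s][r[k]] = P[s].get(r[k], 0.0) + p
    return P

def reach(P, target="t", iters=100):
    """Reachability probabilities by value iteration (exact here: the chain
    is acyclic apart from the absorbing states)."""
    x = {s: 1.0 if s == target else 0.0 for s in P}
    for _ in range(iters):
        x = {s: 1.0 if s == target else sum(p * x[u] for u, p in P[s].items())
             for s in P}
    return x

realizations = [dict(zip(DOMAINS, v)) for v in product(*DOMAINS.values())]
values = [reach(induce(r)) for r in realizations]

# Members satisfying the safety property P<=0.3 [F {t}].
satisfying = [r for r, x in zip(realizations, values) if x["s0"] <= 0.3]

# Brute-force lower bounds over the whole family.
lb = {s: min(x[s] for x in values) for s in STATES}

# For r with X = s1, the bound at s1 already exceeds 0.3, so {s0} is critical
# modulo the family and the conflict {X} rejects every member with X = s1.
r = {"X": "s1", "Y": "t", "T'": "t", "F'": "f"}
rejected = [q for q in realizations if q["X"] == r["X"]]
print(satisfying, lb["s1"], len(rejected))
```

With these assumed numbers the conflict {X} removes two of the four members in a single oracle call, whereas the context-free CE removes only the queried one.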
Counterexample construction
We develop an algorithm using bounds on reachability probabilities, similar to the bounds used above. Let us assume that for some set of realizations $\mathcal{R}$ and for every state $s$, we have bounds $lb_{\mathcal{R}}(s)$, $ub_{\mathcal{R}}(s)$, such that for every $r \in \mathcal{R}$ we have $lb_{\mathcal{R}}(s) \le P[D_r, s \models \Diamond T] \le ub_{\mathcal{R}}(s)$. Such bounds always exist (take 0 and 1). We see later how we compute these bounds. In what follows, we fix $r$ and denote $D_r = (S, s_0, P)$. Let us assume $D_r$ violates a safety property $\varphi \equiv P_{\le\lambda}[\Diamond T]$. The following definition is central:
Definition 6 (Rerouting).
Let MC $D = (S, s_0, P)$ with $s_\top, s_\bot \notin S$, $C \subseteq S$ a set of expanded states and $\gamma : S \setminus C \to [0, 1]$ a rerouting vector. The rerouting of MC $D$ w.r.t. $C$ and $\gamma$ is the MC $D{\downarrow}C[\gamma] = (S \cup \{s_\bot, s_\top\}, s_0, P^C_\gamma)$ with:
$$P^C_\gamma(s) = \begin{cases} P(s) & \text{if } s \in C, \\ [s_\top \mapsto \gamma(s),\ s_\bot \mapsto 1 - \gamma(s)] & \text{if } s \in S \setminus C, \\ [s \mapsto 1] & \text{if } s \in \{s_\top, s_\bot\}. \end{cases}$$

Essentially, $D{\downarrow}C[\gamma]$ extends the MC $D$ with additional sink states $s_\top$ and $s_\bot$ and replaces all outgoing transitions of any non-expanded state $s \in S \setminus C$ by a transition leading to $s_\top$ (with probability $\gamma(s)$) and a complementary one to $s_\bot$. We consider $s_\top$ to be the new target and let $\varphi'$ denote the updated property. The transition from $s$ to $s_\top$ with probability $\gamma(s)$ may be considered a ‘shortcut’ that by-passes successors of $s$ and leads straight to target $s_\top$. To ensure that $D{\downarrow}C[\gamma]$ is a CE, the value $\gamma(s)$ must be a lower bound on the reachability probability from $s$ in $D$. When constructing a CE for a singular MC, we pick $\gamma = \mathbf{0}$, whereas when this MC is induced by a realization $r \in \mathcal{R}$, we can safely pick $\gamma = lb_{\mathcal{R}}$. The CE will be valid for every $r' \in \mathcal{R}$. It is a CE-modulo-$\mathcal{R}$.

Algorithmically, we employ a state-exploration approach and therefore start with $C^{(0)} = \emptyset$, i.e., all states are initially rerouted. If this is a CE, we are done. Otherwise, if the rerouting $D{\downarrow}C^{(0)}[\gamma]$ satisfies $\varphi'$, then we ‘expand’ some states to obtain a CE. Naturally, we must expand reachable states to change the satisfaction of $\varphi$. By expanding some state $s \in S$, we abandon the abstraction associated with the shortcut from $s$ to $s_\top$ and replace it with the concrete behavior of state $s$ in MC $D$. Expanding a state cannot decrease the induced reachability probability as $lb_{\mathcal{R}}$ is a valid lower bound. This gradual expansion of the reachable state space continues until for some $C \subseteq S$ the corresponding rerouting $D{\downarrow}C[\gamma]$ violates $\varphi'$. The expansion process terminates as $D{\downarrow}S[\gamma] \equiv D$ and our assumption is $D \not\models \varphi$. We show this process on an example.
Example 1.
Reconsider $\mathcal{D}$ in Fig. 3 with $\varphi \equiv P_{\le 0.3}[\Diamond\{t\}]$. Using the method outlined below we obtain the lower bounds $lb_{\mathcal{R}}$, with $lb_{\mathcal{R}}(t) = 1$ and $lb_{\mathcal{R}}(f) = 0$. Consider the gradual rerouting approach: we set $\gamma = lb_{\mathcal{R}}$, $C^{(0)} = \emptyset$ and have $D^{(0)} := D_r{\downarrow}C^{(0)}[\gamma]$, see Fig. 4(a). Verifying this MC against $\varphi' = P_{\le 0.3}[\Diamond\, T \cup \{s_\top\}]$ yields $P[D^{(0)}, s_0 \models \Diamond\, T \cup \{s_\top\}] = \gamma(s_0) \le 0.3$, i.e., the set $C^{(0)}$ is not a CE. We now expand the initial state, i.e., $C^{(1)} = \{s_0\}$, and let $D^{(1)} := D_r{\downarrow}C^{(1)}[\gamma]$, see Fig. 4(b). Verifying $D^{(1)}$ yields $P[D^{(1)}, s_0 \models \Diamond\, T \cup \{s_\top\}] = 1 \cdot \gamma(r(X)) > 0.3$. Thus, the set $C^{(1)}$ is critical and the corresponding conflict is $K_{C^{(1)}} = \mathrm{supp}(B(s_0)) = \{X\}$. This is smaller than the naively computed conflict $\{X, Y\}$.

Fig. 4. Finding a CE to $D_r$ and $\varphi$ from Fig. 3 using the rerouting vector $\gamma = lb_{\mathcal{R}}$.

Algorithm 1: Counterexample construction based on rerouting.
Input: An MC $D_r$, a property $\varphi \equiv P_{\bowtie\lambda}[\Diamond T]$ s.t. $D_r \not\models \varphi$, a rerouting vector $\gamma$.
Output: A conflict $K$ for $D_r$ and $\varphi$.
1: $i \leftarrow 0$, $K^{(i)} \leftarrow \emptyset$
2: while true do
3:   $C^{(i)}, H^{(i)} \leftarrow$ reachableViaHoles($D_r$, $K^{(i)}$)
4:   $D^{(i)} \leftarrow D_r{\downarrow}C^{(i)}[\gamma]$
5:   if $P[D^{(i)} \models \Diamond\, T \cup \{s_\top\}] \not\bowtie \lambda$ then return $K^{(i)}$
6:   $s \leftarrow$ chooseToExpand($H^{(i)}$, $K^{(i)}$)
7:   $K^{(i+1)} \leftarrow K^{(i)} \cup \mathrm{supp}(B(s))$
8:   $i \leftarrow i + 1$
9: end while

Greedy state expansion strategy
Recall from Fig. 2 that for an MC $D_r$ with $D_r \not\models \varphi$, multiple CEs may exist inducing different conflicts. An efficient expansion strategy should yield a CE that induces a small number of relevant parameters (to prune more family members), and this CE is preferably obtained by a small number of model-checking queries. The method presented in Alg. 1 meets these criteria. The algorithm expands multiple states between subsequent model checks, while expanding only states that are associated with parameters that are relevant. In particular, in each iteration, we keep track of the set $K^{(i)}$ of relevant parameters, optimistically starting with $K^{(0)} = \emptyset$. We compute (see line 3) the set $C^{(i)}$ of states that are reachable from the initial state via states which are associated only with relevant parameters in $K^{(i)}$, i.e., via states for which $\mathrm{supp}(B(s)) \subseteq K^{(i)}$. Here, $H^{(i)}$ represents a state exploration ‘horizon’: the set of states reachable from $C^{(i)}$ but containing some (still) irrelevant parameters. We then construct the corresponding rerouting $D{\downarrow}C^{(i)}[\gamma]$ and check whether it is a CE. Otherwise, we greedily choose a state $s$ from the horizon $H^{(i)}$ containing the least number of irrelevant parameters and add these parameters to our conflict (see line 7).

Fig. 5. Conceptual hybrid (dual-oracle) synthesis.
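The greedy construction of Alg. 1 can be sketched for safety properties as follows. This is a simplified illustration rather than the implementation: the MC is an explicit dictionary, the rerouted chain is evaluated by plain value iteration (which approximates probabilities from below), trivial single-valued parameters are left out of `holes`, and the transition probabilities and bounds of the example are hypothetical values in the spirit of Fig. 3.

```python
def greedy_conflict(P, holes, s0, targets, gamma, lam, iters=1000):
    """Return a conflict (set of parameters) for an MC violating P<=lam [F targets].

    P: dict state -> dict successor -> probability (the MC D_r).
    holes: dict state -> set of parameters in supp(B(state)).
    gamma: dict state -> lower bound used when the state is rerouted.
    """
    K = set()
    while True:
        # Expanded states C: reachable via states whose holes all lie in K.
        # Horizon H: reached states that still carry an irrelevant parameter.
        C, H, stack = set(), set(), [s0]
        while stack:
            s = stack.pop()
            if s in C or s in H:
                continue
            if holes[s] <= K:
                C.add(s)
                stack.extend(P[s])
            else:
                H.add(s)
        # Rerouted MC: a non-expanded state reaches the new target with gamma(s).
        x = {s: (1.0 if s in targets else 0.0) if s in C else gamma[s]
             for s in C | H}
        for _ in range(iters):
            for s in C:
                if s not in targets:
                    x[s] = sum(p * x[u] for u, p in P[s].items())
        if x[s0] > lam:
            return K  # the rerouting violates the property: K is a conflict
        if not H:
            raise ValueError("the MC does not violate the property")
        # Greedy choice: horizon state adding the fewest new parameters.
        s = min(sorted(H), key=lambda h: len(holes[h] - K))
        K |= holes[s]

# Hypothetical member (X = s1, Y = t) of a family shaped like Fig. 3.
P = {"s0": {"s1": 1.0}, "s1": {"t": 0.8, "f": 0.2},
     "s2": {"t": 0.4, "f": 0.6}, "t": {"t": 1.0}, "f": {"f": 1.0}}
HOLES = {"s0": {"X"}, "s1": {"Y"}, "s2": {"Y"}, "t": set(), "f": set()}
LB = {"s0": 0.2, "s1": 0.6, "s2": 0.2, "t": 1.0, "f": 0.0}
ZERO = {s: 0.0 for s in P}

print(greedy_conflict(P, HOLES, "s0", {"t"}, LB, 0.3))    # with family bounds
print(greedy_conflict(P, HOLES, "s0", {"t"}, ZERO, 0.3))  # without context
```

Under these assumptions the family-aware bounds stop the expansion after the initial state, while the context-free vector forces the search to include both parameters.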
The resulting conflict may not be minimal, but is computed fast. Our algorithm applies to probabilistic liveness properties too, using $\gamma = ub_{\mathcal{R}}$.
Computing bounds
We compute $lb_{\mathcal{R}}$ and $ub_{\mathcal{R}}$ using an abstraction [10]. The method considers some set $\mathcal{R}$ of realizations and computes the corresponding quotient Markov decision process (MDP) that over-approximates the behavior of all MCs in the family $\mathcal{R}$. Model checking this MDP yields an upper and a lower bound of the induced probabilities for all states over all realizations in $\mathcal{R}$. That is, Bound($\mathcal{D}$, $\mathcal{R}$) computes $lb_{\mathcal{R}} \in \mathbb{R}^S$ and $ub_{\mathcal{R}} \in \mathbb{R}^S$ such that for each $s \in S$:
$$lb_{\mathcal{R}}(s) \le \min_{r \in \mathcal{R}} P[D_r, s \models \Diamond T] \le \max_{r \in \mathcal{R}} P[D_r, s \models \Diamond T] \le ub_{\mathcal{R}}(s).$$
To allow for refinement, two properties are crucial (with point-wise inequalities):
1. $lb_{\mathcal{R}} \le lb_{\mathcal{R}'} \wedge ub_{\mathcal{R}} \ge ub_{\mathcal{R}'}$ for $\mathcal{R}' \subseteq \mathcal{R}$, and
2. $lb_{\{r\}} = ub_{\{r\}}$ for $r \in \mathcal{R}$.
In [10], the abstraction and refinement together define an abstraction-refinement loop (AR) that addresses the feasibility problem. In the worst case, this loop analyses $2 \cdot |\mathcal{R}|$ quotient MDPs, which (as of now) may be arbitrarily larger than the number of family members they represent.
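The following sketch illustrates how such bounds can arise from an MDP-style abstraction. It is a simplified stand-in for the quotient MDP of [10], not its actual construction: in every state the nondeterministic choice ranges over the values of the parameters in the support of B at that state, so choices in different states need not be consistent with one realization. The family and its probabilities are the hypothetical ones used for illustration of Fig. 3, and plain value iteration is used, which is exact on this acyclic example but only approximates from below in general.

```python
from itertools import product

def bounds(B, domains, targets, iters=1000):
    """Lower/upper reachability bounds by min/max value iteration.

    B: dict state -> dict parameter -> probability.
    domains: dict parameter -> list of candidate successor states.
    """
    lb = {s: 1.0 if s in targets else 0.0 for s in B}
    ub = dict(lb)

    def actions(s):
        # One action per assignment of values to the parameters used in s.
        ks = sorted(B[s])
        for vals in product(*(domains[k] for k in ks)):
            yield dict(zip(ks, vals))

    for _ in range(iters):
        for s in B:
            if s in targets:
                continue
            lb[s] = min(sum(p * lb[a[k]] for k, p in B[s].items())
                        for a in actions(s))
            ub[s] = max(sum(p * ub[a[k]] for k, p in B[s].items())
                        for a in actions(s))
    return lb, ub

# Hypothetical family in the shape of Fig. 3 (probabilities are assumptions).
B = {"s0": {"X": 1.0},
     "s1": {"T'": 0.6, "Y": 0.2, "F'": 0.2},
     "s2": {"T'": 0.2, "Y": 0.2, "F'": 0.6},
     "t": {"T'": 1.0},
     "f": {"F'": 1.0}}
DOMAINS = {"X": ["s1", "s2"], "Y": ["t", "f"], "T'": ["t"], "F'": ["f"]}

lb, ub = bounds(B, DOMAINS, {"t"})
# Refinement: restricting X to s1 yields a sub-family with tighter bounds.
lb_sub, ub_sub = bounds(B, {**DOMAINS, "X": ["s1"]}, {"t"})
print(lb["s0"], ub["s0"], lb_sub["s0"], ub_sub["s0"])
```

The second call mirrors property 1 above: the bounds of the sub-family lie inside those of the full family.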
We introduce an extended synthesis loop in which the abstraction-based reasoning is used to prune the family $\mathcal{R}$, and to accelerate the CE-based oracle from Sect. 4. The intuitive idea is outlined in Fig. 5. Note that if the CE-based oracle is not exploited, we emulate AR (explained in computing bounds above), whereas if the abstraction oracle is not used, we emulate CEGIS (with the novel oracle).

Let us motivate combining these oracles in a flexible way. The naive version outlined in the previous section assumed a single abstraction step, and invokes CEGIS with the bounds obtained from that step. Evidently, the better (tighter) the bounds $\gamma$, the better the CEs. However, the abstraction-based bounds for $\mathcal{R}$ may be very loose. These bounds can be improved by splitting the set $\mathcal{R}$ and using the bounds on the two sub-families. The idea is to run a limited number of AR steps and then invoke CEGIS. Our experiments reveal that it can be crucial to be adaptive, i.e., the integrated method must be able to detect at run time when to switch.

Some care is required regarding loops, see [9].

Algorithm 2: Hybrid (dual-oracle) synthesis.
Input: A family $\mathcal{D}$, a reachability property $\varphi$.
Output: Either a member $r$ in $\mathcal{D}$ with $r \models \varphi$, or no such $r$ exists in $\mathcal{D}$.
1: $\mathfrak{R} \leftarrow \{\mathcal{R}^{\mathcal{D}}\}$  // each analysed (sub-)family also holds bounds
2: initialize $\delta_{CEGIS}$  // time allocation factor for CEGIS
3: while true do
4:   result, $\mathfrak{R}'$, $\sigma_{AR}$, $t_{AR}$ $\leftarrow$ AR.run($\mathfrak{R}$, $\varphi$)
5:   if result.decided() then return result
6:   CEGIS.setTimeout($t_{AR} \cdot \delta_{CEGIS}$)
7:   result, $\sigma_{CEGIS}$, $\mathfrak{R}''$ $\leftarrow$ CEGIS.run($\mathfrak{R}'$, $\varphi$)
8:   if result.decided() then return result
9:   $\delta_{CEGIS} \leftarrow \sigma_{CEGIS} / \sigma_{AR}$
10:  $\mathfrak{R} \leftarrow \mathfrak{R}''$
11: end while

The proposed hybrid method switches between AR and CEGIS, where we allow for refining during the AR phase and use the obtained refined bounds during CEGIS. Additionally, we estimate the efficiency $\sigma$ (e.g., the number of pruned MCs per time unit) of the two methods and allocate more time $t$ to the method with superior performance. That is, if we detect that CEGIS prunes sub-families twice as fast as AR, we double the time in the next round for CEGIS. The resulting algorithm is summarized in Alg. 2. Recall that AR (at line 4) takes one family from $\mathfrak{R}$, either solves it or splits it, and returns the set of undecided families $\mathfrak{R}'$. In contrast, CEGIS processes multiple families from $\mathfrak{R}'$ until the timeout and then returns the set of undecided families $\mathfrak{R}''$. This workflow is motivated by the fact that one iteration of AR (i.e., the involved MDP model checking) is typically significantly slower than one CEGIS iteration.
Remark 1.
Although the developed framework for integrated synthesis has been discussed in the context of feasibility with respect to a single property ϕ, it can be easily generalized to handle multiple-property specifications as well as to treat optimal synthesis. Regarding multiple properties, the idea remains the same: analyzing the quotient MDP with respect to multiple properties yields multiple probability bounds. After initiating a CEGIS loop and obtaining an unsatisfiable realization, we can construct a separate conflict for each unsatisfied property, while using the corresponding probability bound to enhance the CE generation process. Optimal synthesis is handled similarly to feasibility, but, after obtaining a satisfiable solution, we update the optimizing property to exclude this solution: e.g., for maximal synthesis this translates to increasing the threshold of the maximizing property. Having exhausted the search space of family members, the last obtained solution is declared to be the optimal one.

Table 1. Summary of the benchmarks and their statistics.

    model      |K|    |R^D|    MDP size    avg. MC size
    Grid
    Maze       20     1M       9k          5.4k
    DPM        16     43M      9.5k        2.2k
    Pole       17     1.3M     6.6k        5.6k
    Herman
    Herman∗
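The adaptive loop of Alg. 2 can be illustrated with a minimal Python sketch. Everything here is a toy stand-in and an assumption made for illustration: a family is an index range over a list of numbers, a member "satisfies" the property if its value is at most a threshold, the AR step uses the exact minimum and maximum of a range in place of MDP-based bounds, the CEGIS step rules out only the member it checks, time is counted in abstract cost units, and the initial allocation factor is set to 1. None of the function names come from the actual tool.

```python
from typing import List, Optional, Tuple

Family = Tuple[int, int]          # toy family: candidate indices lo..hi-1
AR_COST, CEGIS_COST = 4.0, 1.0    # assumed abstract time units per iteration


def ar_run(fams: List[Family], vals: List[float], thr: float):
    """One AR-like step: bound one family, then decide it or split it."""
    lo, hi = fams.pop()
    members = vals[lo:hi]
    if max(members) <= thr:       # every member satisfies the property
        return lo, fams, 0.0, AR_COST
    pruned = 0
    if min(members) > thr:        # no member satisfies it: prune the family
        pruned = hi - lo
    else:                         # undecided: split into two sub-families
        mid = (lo + hi) // 2
        fams = fams + [(lo, mid), (mid, hi)]
    return None, fams, pruned / AR_COST, AR_COST


def cegis_run(fams: List[Family], vals: List[float], thr: float, timeout: float):
    """Check single members of several families until the budget is spent."""
    fams, spent, pruned = list(fams), 0.0, 0
    while fams and spent + CEGIS_COST <= timeout:
        lo, hi = fams.pop()
        spent += CEGIS_COST
        if vals[lo] <= thr:
            return lo, pruned / spent, fams
        pruned += 1               # toy CE: excludes only the checked member
        if lo + 1 < hi:
            fams.append((lo + 1, hi))
    return None, (pruned / spent if spent else 0.0), fams


def hybrid(vals: List[float], thr: float) -> Optional[int]:
    """Alternate AR and CEGIS, giving CEGIS a budget of t_AR * delta."""
    fams: List[Family] = [(0, len(vals))]
    delta = 1.0                   # assumed initial time allocation factor
    while fams:
        result, fams, sigma_ar, t_ar = ar_run(fams, vals, thr)
        if result is not None:
            return result
        result, sigma_ce, fams = cegis_run(fams, vals, thr, t_ar * delta)
        if result is not None:
            return result
        if sigma_ar > 0 and sigma_ce > 0:
            delta = sigma_ce / sigma_ar   # favour the faster-pruning oracle
    return None                   # every family was pruned: no solution
```

For example, `hybrid([5, 7, 9, 3, 8, 6], 4)` returns the index of the only value not exceeding 4, while `hybrid([5, 7, 9], 4)` returns `None`. The sketch keeps the previous factor whenever one of the efficiency estimates is zero, which is a design choice of this illustration rather than something stated in the text.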
Implementation.
We implemented the hybrid oracle on top of the probabilistic model checker Storm [18]. While the high-performance parts were implemented in C++, we used a Python API to flexibly construct the overall synthesis loop. For SMT solving, we used Z3 [29]. The tool chain takes a PRISM [27] or JANI [6] sketch and a set of temporal properties, and returns a satisfying realization if one exists, or reports that no such realization exists. The implementation in the form of an artefact is available at https://zenodo.org/record/4422543.
Set-up.
We compare the adaptive oracle-guided synthesis with two state-of-the-art synthesis methods: program-level CEGIS [9] using a MaxSat CE generation [16,41] and AR [10]. These use the same architecture and data structures from Storm. All experiments are run on an Ubuntu 19.04 machine with an Intel i5-8300H (4 cores at 2.3 GHz) and up to 8 GB RAM, with all the algorithms being executed on a single thread. The benchmarks consist of five different models, see Table 1, from various domains that were used in [9,10]. As opposed to the benchmarks considered in [9,10], we use larger variants of Grid and Herman to better demonstrate differences in the performance of individual methods.

To investigate the scalability of the methods, we consider a new variant of the Herman model that allows us to scale the number of randomization strategies and thus the family size. In particular, we will compare performance on two instances of different sizes: small Herman∗ (5k members) and large Herman∗ (3.1M members, other statistics are reported in Table 1).

To reason about the pruning efficiency of different synthesis methods, we want to avoid feasible synthesis problems, where the order of family exploration can lead to inconsistent performance. Instead, we will primarily focus on non-feasible problems, where all realizations need to be explored in order to prove unsatisfiability. The experimental evaluation is presented in three parts. (1) We evaluate the novel CE construction method and compare it with the MaxSat-based oracle from [9]. (2) We compare the hybrid synthesis loop with the two baselines AR and CEGIS. (3) We consider novel hard synthesis instances (multi-property synthesis, finding optimal programs) on instances of the model Herman∗.

Comparing CE construction methods
We consider the quality of the CEs and their generation time. In particular, we want to investigate (1) whether using CEs-modulo-families yields better CEs, (2) how the quality of CEs from the smart oracle compares to the MaxSat-based oracle, and (3) how their time consumption compares. As a measure of CE quality, we take the average ratio of the number of relevant parameters of a CE to the total number of parameters. That is, smaller ratios imply better CEs. To measure the influence of using CEs-modulo-families, two types of bounds are used: (i) trivial bounds (i.e., a fixed trivial value of γ for safety and another for liveness properties), and (ii) non-trivial bounds corresponding to the entire family R^D, representing the most conservative estimate. The results are reported in (the left part of) Table 2. In the next subsection, we investigate this same benchmark from the point of view of the performance of the synthesis methods, which also shows the immediate effect of the new CE generation strategy.

Table 2. CE quality for different methods and performance of three synthesis methods. For each model/property, we report results for two different thresholds, where the symbol '∗' marks the one closer to the feasibility threshold, representing the more difficult synthesis problem. The symbol '-' marks a two-hour timeout. CE quality: the presented numbers give the CE quality (i.e., the smaller, the better). The numbers in parentheses represent the average run-time of constructing one CE in seconds (run-times for constructing CEs using non-trivial bounds are similar to those for trivial ones and are thus not reported). Performance: for each method, we report the number of iterations (for the hybrid method, the reported values are iterations of the CEGIS and AR oracle, respectively) and the run-time in seconds.

              CE quality                                    Performance
    model     MaxSat   trivial bounds   non-trivial bounds  CEGIS           AR              Hybrid
                                                            iters   time    iters   time    iters       time
    Grid
      ∗
    Maze
      ∗
    DPM
      ∗
    Pole      -        0.87 (0.062)     0.16                -       -       309     12      (3, 5)
      ∗       -        0.54 (0.041)     0.29                -       -       615     23      (80, 61)
    Herman    -        0.91 (0.011)     0.50                -       -       171     86      (24, 1)
      ∗       -        0.88 (0.016)     0.87                -       -       643     269     (485, 13)

The first observation is that using non-trivial bounds (as opposed to trivial ones) for the state expansion approach can drastically decrease the conflict size. It turns out that the CEs obtained using the greedy approach are mostly larger than those obtained with the MaxSat method. However (see Grid), even for trivial bounds, we may obtain smaller CEs than for MaxSat: computing a minimal-command CE does not necessarily induce an optimal conflict. On the other hand, comparing the run-times in the parentheses, one can see that computing CEs via the greedy state expansion is orders of magnitude faster than computing command-optimal ones using MaxSat. Note that the greedy method makes at most |K| model-checking queries to compute CEs, while the MaxSat method may make exponentially many such queries. Overall, the greedy method using the non-trivial bounds is able to obtain CEs of quality comparable to the MaxSat method, while being orders of magnitude faster.

Performance comparison with AR/CEGIS
We compare the hybrid synthesis loop from Sect. 5 with two state-of-the-art baselines: CEGIS and AR. The results are displayed in (the right half of) Table 2. In all 10 cases, the hybrid method outperforms the baselines. It is up to an order of magnitude faster.

Let us discuss the performance of the hybrid method. We classify benchmarks along two dimensions: (1) the performance of CEGIS and (2) the performance of AR. Based on the empirical performance, we classify Grid as good-for-CEGIS (and not for AR), Maze, Pole and DPM as good-for-AR (and not for CEGIS), and Herman as hard (for both). Roughly, AR works well when the quotient MDP does not blow up and its analysis is precise due to consistent schedulers, i.e., when the parameter dependencies are not crucial for a precise analysis. CEGIS performs well when the CEs are small and fast to compute. On the other hand, synthesis problems for which neither pure CEGIS nor pure AR is able to effectively reason about non-trivial subfamilies inherently profit from a hybrid method. The main point we want to discuss is how the hybrid method reinforces the strengths of both methods, rather than their weaknesses.

In the hybrid method, there are two factors that determine the efficiency: (i) how fast do we get bounds on the reachability probability that are tight enough to enable construction of good counterexamples? and (ii) how good are the constructed counterexamples? The former factor is attributed to the proposed adaptive scheme (see Alg. 2), where the method will prefer AR-like analysis and continue refinement until the computed bounds allow construction of small counterexamples. The latter was discussed above. Let us now discuss how these two aspects are reflected in the benchmarks.

In good-for-CEGIS benchmarks like Grid, after analyzing a quotient MDP for the whole family, the hybrid method mostly profits from better CEs yielding better bounds, thus outperforming CEGIS. Indeed, the CEs are found so fast that the bottleneck is no longer their generation. This also explains why the speedup in CE generation does not translate directly into a speedup of the overall synthesis loop. In the good-for-AR benchmark DPM, the hybrid method provides only a minor improvement, as it has to perform a large number of AR iterations before the novel CE-based pruning can be effectively used. This can be considered the worst-case scenario for the hybrid method. On other good-for-AR benchmarks like Maze and Pole, the good performance of AR allows tight bounds to be obtained quickly, which can then be exploited by CEGIS. Finally, in hard models like Herman, abstraction-refinement is very expensive, but even the first round yields bounds that, as opposed to the trivial bounds, enable good CEs: CEGIS can keep using these bounds to quickly prune the state space.
More complicated synthesis problems
Our new approach can push the limits of synthesis benchmarks significantly. We illustrate this by considering a new variant of the Herman model, Herman∗, and a property imposing an upper bound on the expected number of rounds until stabilization. We put this bound just below the optimal (i.e., the minimal) value, yielding a hard non-feasible problem. The synthesis results are summarized in Table 3. As CEGIS performs poorly on Herman, it is excluded here.

Table 3. The impact of scaling the family size (of the Herman∗ model) and handling more complex synthesis problems. The left part shows the results for the smaller variant (5k members), the right part for the larger one (3.1M members).

    Smaller variant (5k members)
    synthesis problem    AR iters    AR time    Hybrid iters    Hybrid time
    two properties       97          38s        (274, 1)
    optimality           531         150s       (571, 7)

    Larger variant (3.1M members)
    synthesis problem    AR iters    AR time    Hybrid iters    Hybrid time
    feasibility          69k         47h        (14280, 2)
    optimality           83k         55h        (16197, 3)

First, we investigate on small Herman∗ how the methods can handle the synthesis for multi-property specifications. We add one feasible property to the (still non-feasible) specification (row 'two properties'). While including more properties typically slows down the AR computation, the performance of the hybrid method is not affected, as the corresponding overhead is mitigated by additional pruning opportunities. Second, we consider optimal synthesis for the property used in the feasibility synthesis. The hybrid method requires only a minor overhead to find an optimal solution compared to checking feasibility. This overhead is significantly larger for AR.

Next, we consider the larger Herman∗ model, which has significantly more randomization strategies (3.1M members), including solutions leading to a considerably faster stabilization. This model is out of reach for existing synthesis approaches: one-by-one enumeration takes more than 27 hours, and AR performs even worse, requiring 47 and 55 hours to solve the feasibility and optimality problems, respectively. On the other hand, the proposed hybrid method is able to solve these problems within minutes. Finally, we consider a relaxed variant of optimal synthesis (5%-optimality) guaranteeing that the found solution is up to 5% worse than the optimal. Relaxing the optimality criterion speeds up the hybrid synthesis method by about a factor of three.

These experiments clearly demonstrate that scaling up the synthesis problem by several orders of magnitude renders existing synthesis methods infeasible: they need tens of hours to solve the synthesis problems. Meanwhile, the hybrid method tackles these difficult synthesis problems without significant penalty and is capable of producing a solution within minutes.
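Optimal synthesis by repeated threshold tightening, including the relaxed variant, can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's implementation: `find_below` is a hypothetical stand-in for any feasibility oracle (here a linear scan over a list of non-negative objective values, e.g. expected numbers of rounds), the objective is minimized, and the relaxation is modelled by requiring each new solution to improve on the previous one by more than a factor `eps`. The exact relaxation rule used in the experiments is not given in the text.

```python
from typing import Optional, Sequence


def find_below(values: Sequence[float], threshold: float) -> Optional[int]:
    """Hypothetical feasibility oracle: index of a member whose value is
    strictly below the threshold, or None if no such member exists."""
    for index, value in enumerate(values):
        if value < threshold:
            return index
    return None


def minimal_synthesis(values: Sequence[float], eps: float = 0.0) -> Optional[int]:
    """Repeat feasibility queries, tightening the threshold after each
    solution. Assumes non-negative values and 0 <= eps < 1."""
    best: Optional[int] = None
    threshold = float("inf")
    while True:
        found = find_below(values, threshold)
        if found is None:
            # Search space exhausted: the last solution is (eps-)optimal.
            return best
        best = found
        # Exclude this solution and everything not sufficiently better.
        threshold = values[found] * (1.0 - eps)
```

With `eps = 0` the loop ends at a true minimum. With `eps = 0.05` it may stop earlier: once no member is below `(1 - eps)` times the current value, the current solution is returned, so fewer feasibility queries are needed at the price of a slightly worse result.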
We present a novel method for the automated synthesis of probabilistic programs. Pairing the counterexample-guided inductive synthesis with the deductive oracle using an MDP abstraction, we develop a synthesis technique enabling faster construction of smaller counterexamples. Evaluating the method on case studies from different domains, we demonstrate that the novel CE construction and the adaptive strategy lead to a significant acceleration of the synthesis process. The proposed method is able to reduce the run-time for challenging problems from days to minutes. In our future work, we plan to investigate counterexamples on the quotient MDPs and improve the abstraction refinement strategy.
References