[PDF] Inductive Synthesis for Probabilistic Programs Reaches New Horizons

Abstract

This paper presents a novel method for the automated synthesis of probabilistic programs. The starting point is a program sketch representing a finite family of finite-state Markov chains with related but distinct topologies, and a PCTL specification. The method builds on a novel inductive oracle that greedily generates counter-examples (CEs) for violating programs and uses them to prune the family. These CEs leverage the semantics of the family in the form of bounds on its best- and worst-case behaviour provided by a deductive oracle using an MDP abstraction. The method further monitors the performance of the synthesis and adaptively switches between the inductive and deductive reasoning. Our experiments demonstrate that the novel CE construction provides a significantly faster and more effective pruning strategy leading to acceleration of the synthesis process on a wide range of benchmarks. For challenging problems, such as the synthesis of decentralized partially-observable controllers, we reduce the run-time from a day to minutes.


Inductive Synthesis for Probabilistic Programs Reaches New Horizons*

(TACAS Artifact Evaluation badge: Consistent, Complete, Well Documented, Easy to Reuse, Evaluated.)

Roman Andriushchenko, Milan Češka (✉), Sebastian Junges, and Joost-Pieter Katoen

Brno University of Technology, Brno, Czech Republic [email protected]
University of California, Berkeley, USA
RWTH Aachen University, Aachen, Germany

Abstract.

This paper presents a novel method for the automated synthesis of probabilistic programs. The starting point is a program sketch representing a finite family of finite-state Markov chains with related but distinct topologies, and a reachability specification. The method builds on a novel inductive oracle that greedily generates counter-examples (CEs) for violating programs and uses them to prune the family. These CEs leverage the semantics of the family in the form of bounds on its best- and worst-case behaviour provided by a deductive oracle using an MDP abstraction. The method further monitors the performance of the synthesis and adaptively switches between inductive and deductive reasoning. Our experiments demonstrate that the novel CE construction provides a significantly faster and more effective pruning strategy leading to an accelerated synthesis process on a wide range of benchmarks. For challenging problems, such as the synthesis of decentralized partially-observable controllers, we reduce the run-time from a day to minutes.

Background and motivation.

Controller synthesis for Markov decision processes (MDPs [35]) and temporal logic constraints is a well-understood and tractable problem, with a plethora of mature tools providing efficient solving capabilities. However, the applicability of these controllers to a variety of systems is limited: Systems may be decentralized, controllers may not be able to observe the complete system state, cost constraints may apply, and so forth. Adequate operational models for these systems exist in the form of decentralized partially-observable MDPs (DEC-POMDPs [33]). The controller synthesis problem for these models is undecidable [30], and tool support (for verification tasks) is scarce.

This paper takes a different approach: the controller together with the environment can be modelled as probabilistic program sketches where “holes” in the probabilistic program model choices that the controller may make. Conceptually, the controllers of the DEC-POMDP are described by a user-defined finite family $\mathcal{M}$ of Markov chains. The synthesis problem that we consider is to find a Markov chain $M$ (i.e., a probabilistic program) in the family $\mathcal{M}$, such that $M \models \varphi$, where $\varphi$ is the specification. To allow efficient algorithms, the family must have some structure. In particular, in our setting, the family is parameterized by a set of discrete parameters $K$; an assignment $K \to V$ of these parameters with concrete values $V$ from their associated domains yields a family member, i.e., a Markov chain (MC). Such a parameterization is naturally obtained from the probabilistic program sketch, where some constants (or program parts) can be left open. The search for a family member can thus be considered as the search for a hole-assignment. This approach fits within the realm of syntax-guided synthesis [2].

* This work has been partially supported by the Czech Science Foundation grant GJ20-02328Y and the ERC AdG Grant 787914 FRAPPANT, the NSF grants 1545126 (VeHICaL) and 1646208, by the DARPA Assured Autonomy program, by Berkeley Deep Drive, and by Toyota under the iCyPhy center.

Motivating example. Herman’s protocol [24] is a well-studied randomized distributed algorithm aimed to obtain fast stabilization on average. In [26], a family $\mathcal{M}$ of MCs is used to model different protocol instances. They considered each instance separately, and found which of the controllers for Herman’s protocol performs best. Let us consider the protocol in a bit more detail: It considers self-stabilization of a unidirectional ring of network stations where all stations have to behave similarly, i.e., an anonymous network. Each station stores a single bit, and can read the internal bit of one (say left) neighbour. To achieve stabilization, a station for which the two legible bits coincide updates its own bit based on the outcome of a coin flip.
The challenge is to select a controller that flips this coin with an optimal bias, i.e., minimizing the expected time until stabilization. In a setting where the probabilities range over 0.1, 0.2, ..., 0.9, this results in analyzing nine different MCs. Does the expected time until stabilization reduce if the controllers are additionally allowed to have a single bit of memory? In every step, there are 9 · · · ,368 models. Eventually, analyzing all individual MCs is infeasible.

Oracle-guided synthesis.

To tackle the synthesis problem, we introduce an oracle-guided inductive synthesis approach [25,39]. A learner selects a family member and passes it to the oracle. The oracle answers whether the family member satisfies $\varphi$, and crucially, gives additional information if it does not. Inspired by [9], if the family member violates the specification $\varphi$, our oracle returns a set $K'$ of parameters such that all family members obtained by changing only the values assigned to $K'$ violate $\varphi$. We argue that such an oracle must (1) induce little overhead in providing $K'$, (2) be aware of the existence of parameters in the family, and (3) have (resemblance of) awareness about the semantics of the parameters and their values.

Oracles.

With these requirements in mind, we construct a counterexample (CE)-based oracle from scratch. We do so by carefully exploiting existing methods. We construct critical subsystems as CEs [1]. Critical subsystems are parts of the MC that suffice to refute the specification. If a hole is absent in a CE, its value is irrelevant. To avoid the cost of finding optimal CEs, an NP-hard problem [19], we consider greedy CEs that are similar to [9]. However, our greedy CEs are aware of the parameters, and try to limit the occurrence of parameters in the CE. Finally, to provide awareness of the semantics of parameter values, we provide lower and upper bounds on the reachability probability for all states: Their difference indicates how much varying the value at a hole may change the overall reachability probability. These bounds are efficiently computed by another oracle. This oracle analyses a quotient MDP obtained by employing an abstraction method that is part of the abstraction-refinement loop in [10].

A hybrid variant.

The two oracles are significantly different. Abstraction refinement is deductive: it argues about single family members by considering (an aggregation of) all family members. The critical subsystem oracle is inductive: by examining a single family member, it infers statements about other family members. This suggests a middle ground: a hybrid strategy monitors the performance of the two oracles during the synthesis and suggests their best usage. More precisely, the hybrid strategy integrates the counterexample-based oracle into the abstraction-refinement loop.

Major results.

We present a novel and dedicated oracle deployed in an efficacious synthesis loop. We use model-checking results on an abstraction to tailor smaller CEs. Our greedy and family-aware CE construction is substantially faster than the use of optimal CEs. Together, these two improvements yield CEs that are on par with optimal CEs, but are found much faster. The integration of multiple abstraction-refinement steps yields a superior performance: We compare our performance with the abstraction-refinement loop from [10] using benchmarks from [10]. Benchmarks can be classified along two dimensions: (A) Benchmarks with a structure good for CE-generation. (B) Benchmarks with a structure good for abstraction-refinement. A-benchmarks are a natural strength of our novel oracle. Our simple, efficient hybrid strategy significantly outperforms the state-of-the-art on A-benchmarks, while it only yields limited overhead for B-benchmarks. Most importantly, the novel hybrid strategy can solve benchmarks that are out of reach for pure abstraction-refinement or pure CE-based reasoning. In particular, our hybrid method is able to synthesize the optimal Herman protocol with memory: the synthesis time on a design space with 3.1 million candidate programs reduces from a day to minutes.

Related work

The synthesis problems for parametric probabilistic systems can be divided into the following two categories.

Topology synthesis, akin to the problem considered in this paper, assumes a finite set of parameters affecting the MC topology. Finding an instantiation satisfying a reachability property is NP-complete in the number of parameters [12], and can naively be solved by analyzing all individual family members. An alternative is to model the MC family by an MDP and resort to standard MDP model-checking algorithms. Tools such as ProFeat [13] or QFLan [40] take this approach to quantitatively analyze alternative designs of software product lines [21,28]. These methods are limited to small families. This motivated (1) abstraction-refinement over the MDP representation [10], and (2) counterexample-guided inductive synthesis (CEGIS) for MCs [9], mentioned earlier. The alternative problem of sketching for probabilistic programs that fit given data is studied, e.g., in [32,38].

Parameter synthesis considers models with uncertain parameters associated with transition probabilities, and analyses how the system behaviour depends on the parameter values. The most promising techniques are based on parameter lifting that treats identical parameters in different transitions independently [8,36] and has been implemented in the state-of-the-art probabilistic model checkers Storm [18] and PRISM [27]. An alternative approach based on building rational functions for the satisfaction probability has been proposed in [15] and further improved in [22,17,4]. This approach has also been applied to different problems such as model repair [5,34,11].

Both synthesis problems can also be attacked by search-based techniques that do not ensure an exhaustive exploration of the parameter space. These include evolutionary techniques [23,31] and genetic algorithms [20]. Combinations with parameter synthesis have been used [7] to synthesize robust systems.

We formalize the essential ingredients and the problem statement. See [3] for more material.

Sets of Markov chains. A (discrete) distribution over a finite set $X$ is a function $\mu : X \to [0,1]$ s.t. $\sum_{x \in X} \mu(x) = 1$. The set $\mathit{Distr}(X)$ contains all distributions over $X$. The support of $\mu \in \mathit{Distr}(X)$ is $\mathrm{supp}(\mu) = \{ x \in X \mid \mu(x) > 0 \}$.

Definition 1 (MC). A Markov chain (MC) is a tuple $D = (S, s_0, P)$, where $S$ is a finite set of states, $s_0 \in S$ is an initial state, and $P : S \to \mathit{Distr}(S)$ is a transition probability function. We write $P(s,t)$ to denote $P(s)(t)$. The state $s$ is absorbing if $P(s,s) = 1$.

Let $K$ denote a finite set of discrete parameters, where each parameter $k \in K$ has a finite domain $V_k$. For brevity, we often assume that all domains are the same, and omit the subscript $k$. A realization $r$ maps parameters to values in their domain, i.e., $r : K \to V$. Let $\mathcal{R}^{\mathcal{D}}$ denote the set of all realizations of a set $\mathcal{D}$ of MCs. A $K$-parameterized set of MCs $\mathcal{D}(K)$ contains the MCs $D_r$, for every $r \in \mathcal{R}^{\mathcal{D}}$. In Sect. 3, we give an operational model for such sets. In particular, realizations will fix the targets of transitions. In our experiments, we describe these sets using the PRISM modelling language where parameters are described by undefined integer values.

Properties and specifications.

For simplicity, we consider (unbounded) reachability properties. (Footnote: Our implementation also supports expected reachability rewards.) For a set $T \subseteq S$ of target states, let $\mathbb{P}[D, s \models \Diamond T]$ denote the probability in MC $D$ to eventually reach some state in $T$ when starting in the state $s \in S$. A property $\varphi \equiv \mathbb{P}_{\bowtie \lambda}[\Diamond T]$ with $\lambda \in [0,1]$ and ${\bowtie} \in \{\le, \ge\}$ expresses that the probability to reach $T$ relates to $\lambda$ according to $\bowtie$. If ${\bowtie} = {\le}$, then $\varphi$ is a safety property; otherwise, it is a liveness property. Formally, state $s$ in MC $D$ satisfies $\varphi$ if $\mathbb{P}[D, s \models \Diamond T] \bowtie \lambda$. The MC $D$ satisfies $\varphi$ if the above holds for its initial state. A specification is a set of properties $\Phi = \{\varphi_i\}_{i \in I}$, and $D \models \Phi$ if $\forall i \in I : D \models \varphi_i$.

Problem statement.

The key problem statement in this paper is feasibility: Given a parameterized set of Markov chains $\mathcal{D}(K)$ over parameters $K$ and a specification $\Phi$, find a realization $r : K \to V$ such that $D_r \models \Phi$. When $\mathcal{D}$ is clear from the context, we often write $r \models \Phi$ to denote $D_r \models \Phi$.

We additionally consider the optimizing variant of the synthesis problem. The maximal synthesis problem asks: given a maximizing property $\varphi_{\max} \equiv \mathbb{P}_{\bowtie \lambda}[\Diamond T]$, identify $r^* \in \arg\max_{r \in \mathcal{R}^{\mathcal{D}}} \{ \mathbb{P}[D_r \models \Diamond T] \mid D_r \models \Phi \}$ provided it exists. The minimal synthesis problem is defined analogously.

As the state space $S$, the set $K$ of parameters, and their domains are all finite, the above synthesis problems are decidable. One possible solution, called the one-by-one approach [14], considers each realization $r \in \mathcal{R}^{\mathcal{D}}$. The state-space and parameter-space explosion renders this approach unusable for large problems, necessitating the usage of advanced techniques that exploit the family structure.

In this section, we recap a baseline for a counterexample-guided inductive synthesis (CEGIS) loop, as put forward in [9]. In particular, we first instantiate an oracle-guided synthesis method, discuss an operational model for families, giving structure to the parameterized set of Markov chains, and finally detail the usage of CEs to create an oracle.

Fig. 1. Oracle-guided synthesis. (Diagram labels: Learner, Oracle; input $\mathcal{R}^{\mathcal{D}}, \Phi$; query $r \in \mathcal{R}$; answer $r \in \mathcal{R}' \subseteq \mathcal{R}$, $\mathcal{R}'$ all violate $\Phi$; outputs $r \models \Phi$ or no $r \models \Phi$.)

Consider Fig. 1. A learner takes a set $\mathcal{R}$ of realizations, and has to find a realization $r$ whose MC $D_r$ satisfies the specification $\Phi$. The learner maintains (a symbolic representation of) a set $\mathcal{Q} \subseteq \mathcal{R}$ of realizations that need to be checked. It iteratively asks the oracle whether a particular $r \in \mathcal{Q}$ is a solution. If it is a solution, the oracle reports success. Otherwise, the oracle returns a set $\mathcal{R}'$ containing $r$ and potentially more realizations all violating $\Phi$. The learner then prunes $\mathcal{R}'$ from $\mathcal{Q}$. In Section 4, we focus on creating an efficient oracle that computes a set $\mathcal{R}'$ (with $r \in \mathcal{R}'$) of realizations that are all violating $\Phi$. In Section 5, we provide a more advanced framework that extends this method. The remainder of this section lays the groundwork for these sections.
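The learner loop just described can be sketched in a few lines. This is only a minimal illustration, not the paper's implementation: the set of unchecked realizations is enumerated explicitly rather than kept symbolic, the oracle is assumed to return a set of parameters whose current values already force a violation, and `synthesize` and `toy_oracle` (with its made-up specification) are hypothetical names.

```python
from itertools import product


def synthesize(domains, oracle):
    """Oracle-guided search over an explicitly enumerated family.

    domains: dict parameter -> list of candidate values.
    oracle(r): returns (True, None) if realization r satisfies the spec,
               else (False, conflict), where conflict is a set of parameters
               whose current values alone already force a violation.
    Returns (satisfying realization or None, number of oracle queries).
    """
    params = sorted(domains)
    # Q: realizations that still need to be checked (explicit here; a real
    # learner would keep a symbolic representation).
    remaining = set(product(*(domains[k] for k in params)))
    queries = 0
    while remaining:
        candidate = min(remaining)  # deterministic choice of some r in Q
        r = dict(zip(params, candidate))
        queries += 1
        satisfied, conflict = oracle(r)
        if satisfied:
            return r, queries
        # Prune R': every realization that agrees with r on the conflict.
        fixed = [i for i, k in enumerate(params) if k in conflict]
        remaining = {q for q in remaining
                     if any(q[i] != candidate[i] for i in fixed)}
    return None, queries


def toy_oracle(r):
    """Hypothetical specification: X must be 2 and Y must be 1."""
    if r["X"] != 2:
        return False, {"X"}  # every realization with this X value fails
    if r["Y"] != 1:
        return False, {"Y"}  # every realization with this Y value fails
    return True, None


solution, queries = synthesize({"X": [0, 1, 2], "Y": [0, 1]}, toy_oracle)
```

With single-parameter conflicts the toy family of six realizations is decided in four queries; an oracle that always returns the full parameter set degenerates to the one-by-one approach.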

Families of Markov chains

To avoid the need to iterate over all realizations, an efficient oracle exploits some structure of the family. In this paper, we focus on sets of Markov chains having different topologies. We explain our concepts using the operational model of families given in [10]. Our implementation supports (more expressive) PRISM programs with undefined integer constants.
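To make this operational model concrete before it is formalized in Definitions 2 and 3, the following sketch represents a family by a map from states to distributions over parameters and resolves it under one realization. The three-state family, the parameter names and the helper `induce` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from math import prod


def induce(B, r):
    """Transition function of the MC induced by realization r.

    B: dict state -> dict parameter -> probability (distribution over parameters).
    r: dict parameter -> successor state chosen for that parameter.
    The probability of moving from s to s' is the total weight of all
    parameters k with r(k) = s'.
    """
    P = {}
    for s, dist in B.items():
        row = {}
        for k, p in dist.items():
            row[r[k]] = row.get(r[k], Fraction(0)) + p
        P[s] = row
    return P


# Hypothetical family: two open parameters X, Y and two trivial ones A, Bb.
B = {
    "init": {"X": Fraction(1, 2), "Y": Fraction(1, 2)},
    "a": {"A": Fraction(1)},
    "b": {"Bb": Fraction(1)},
}
domains = {"X": ["a", "b"], "Y": ["a", "b"], "A": ["a"], "Bb": ["b"]}

family_size = prod(len(v) for v in domains.values())  # product of domain sizes
P_same = induce(B, {"X": "a", "Y": "a", "A": "a", "Bb": "b"})
P_split = induce(B, {"X": "a", "Y": "b", "A": "a", "Bb": "b"})
```

Two parameters mapped to the same successor simply add up their weights, which is how different realizations yield different topologies over the same state space.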

Definition 2 (Family of MCs). A family of MCs is a tuple $\mathcal{D} = (S, s_0, K, B)$ where $S$ and $s_0$ are as before, $K$ is a finite set of parameters with domains $V_k \subseteq S$ for each $k \in K$, and $B : S \to \mathit{Distr}(K)$ is a family of transition probability functions.

Function $B$ of a family $\mathcal{D}$ of MCs maps each state to a distribution over parameters $K$. In the context of the synthesis of probabilistic models, these parameters represent unknown options or features of a system under design. Realizations are now defined as follows.

Definition 3 (Realization). A realization of a family $\mathcal{D} = (S, s_0, K, B)$ of MCs is a function $r : K \to S$ s.t. $r(k) \in V_k$, for all $k \in K$. We say that realization $r$ induces MC $D_r = (S, s_0, B_r)$ iff $B_r(s, s') = \sum_{k \in K,\, r(k) = s'} B(s)(k)$ for any pair of states $s, s' \in S$. The set of all realizations of $\mathcal{D}$ is denoted as $\mathcal{R}^{\mathcal{D}}$.

The set $\mathcal{R}^{\mathcal{D}} = \prod_{k \in K} V_k$ of all possible realizations is exponential in $|K|$.

Counterexample-guided oracles

We first consider the feasibility synthesis for a single-property specification and later, cf. Remark 1, generalize this to multiple properties and to optimal synthesis. The notion of counterexamples is at the heart of the oracle from [9] and Sect. 4.

If an MC $D \not\models \varphi$, a counterexample (CE) based on a critical subsystem can serve as diagnostic information about the source of the failure. We consider the following CE, motivated by the notion of critical subsystem in [37].

Definition 4 (Counterexample).

Let $D = (S, s_0, P)$ be an MC with $s_\bot \notin S$. The sub-MC of $D$ induced by $C \subseteq S$ is the MC $D{\downarrow}C = (S \cup \{s_\bot\}, s_0, P')$, where the transition probability function $P'$ is defined by:

$$P'(s) = \begin{cases} P(s) & \text{if } s \in C, \\ [s_\bot \mapsto 1] & \text{otherwise.} \end{cases}$$

The set $C$ and the sub-MC $D{\downarrow}C$ are called a counterexample (CE) for the property $\mathbb{P}_{\le \lambda}[\Diamond T]$ on MC $D$, if $D{\downarrow}C \not\models \mathbb{P}_{\le \lambda}[\Diamond (T \cap (C \cup \{s_0\}))]$.

Let $D_r$ be an MC violating the specification $\varphi$. To compute other realizations violating $\varphi$, the oracle computes a critical subsystem $D_r{\downarrow}C$, which is then used to deduce a so-called conflict for $D_r$ and $\varphi$.

Definition 5 (Conflict).

For a family of MCs $\mathcal{D} = (S, s_0, K, B)$ and $C \subseteq S$, the set $K_C$ of relevant parameters (called conflict) is given by $\bigcup_{s \in C} \mathrm{supp}(B(s))$.

Fig. 2. Counterexamples for smaller conflicts.

It is straightforward to compute a set of violating realizations from a conflict. A generalization of realization $r$ induced by the set $K_C \subseteq K$ of relevant parameters is the set $r{\uparrow}K_C = \{ r' \in \mathcal{R} \mid \forall k \in K_C : r(k) = r'(k) \}$. We often use the term conflict to refer to its generalization. The size of a conflict, i.e., the number $|K_C|$ of relevant parameters $K_C$, is crucial. Small conflicts potentially lead to generalizing $r$ to larger subfamilies $r{\uparrow}K_C$. It is thus important that the CEs contain as few parameterized transitions as possible. The size of a CE in terms of the number of states is not of interest. Furthermore, the overhead of providing CEs should be bounded from above by the payoff: Finding a large generalization may take some time, but small generalizations should be returned quickly. The CE-based oracle in [9] uses an off-the-shelf CE procedure [16,41], and mostly does not provide small CEs.

This section develops an oracle based on CEs, tailored for the use in an oracle-guided inductive synthesis loop described in Sect. 3. Its main features are:
– a fast greedy approach to compute CEs that provide small conflicts: We achieve this by taking into account the position of the parameters.
– awareness about the semantics of parameters by using model-checking results from an abstraction of the family.
Before going into details, we provide some illustrative examples.

A motivating example

First, we illustrate what it means to take CEs that lead to small conflicts. Consider Fig. 2, with a family member $D_r$ (left), where the superscript of a state identifier $s_i$ denotes parameters relevant to $s_i$. Consider a safety property $\varphi \equiv \mathbb{P}_{\le \lambda}[\Diamond \{t\}]$ with the threshold $\lambda$ of Fig. 2. Clearly, $D_r \not\models \varphi$, and we can construct two CEs: a three-state CE $C_1$ (center) and a four-state CE $C_2$ (right), with conflicts $K_{C_1} = \{X, Y\}$ and $K_{C_2} = \{X\}$, respectively. This illustrates that a smaller CE does not necessarily induce a smaller conflict.

We now illustrate awareness of the semantics of parameters. Consider the family $\mathcal{D} = (S, s_0, K', B)$, where $S = \{s_0, s_1, s_2, t, f\}$, the parameters are $K' = \{X, Y, T', F'\}$ with domains $V_X = \{s_1, s_2\}$, $V_Y = \{t, f\}$, $V_{T'} = \{t\}$, $V_{F'} = \{f\}$, and a family $B$ of transition probability functions defined in Fig. 3 (left). As the parameters $T'$ and $F'$ each can take only one value, we consider $K = \{X, Y\}$ as the set of parameters. There are $|V_X| \times |V_Y| = 4$ family members, depicted in Fig. 3 (right). For conciseness, we omit some of the transition probabilities (recall that transition probabilities sum to one). Only one realization satisfies the safety property $\varphi \equiv \mathbb{P}_{\le 0.3}[\Diamond \{t\}]$.

Fig. 3. A family $\mathcal{D}$ of four Markov chains (unreachable states are grayed out). Left: $B(s_0) = [X \mapsto 1]$; $B(s_1)$ and $B(s_2)$ are distributions over $T'$, $Y$ and $F'$; $B(t) = [T' \mapsto 1]$; $B(f) = [F' \mapsto 1]$.

CEGIS [9] illustrated: Consider running CEGIS, and assume the oracle first gets a realization $r$ with $r(X) = s_1$. A model checker reveals that $\mathbb{P}[D_r, s_0 \models \Diamond T]$ exceeds 0.3. The CE for $D_r$ and $\varphi$ contains the (only) path to the target, $s_0 \to s_1 \to t$, whose probability exceeds 0.3. The corresponding CE $C = \{s_0, s_1, t\}$ induces the conflict $K_C = \{X, Y\}$. None of the parameters is generalized. The same argument applies to any subsequent realization: the constructed CEs do not allow for generalization, the oracle returns only the passed realization, and the learner keeps iterating until accidentally guessing the satisfying realization.

Can we do better?

To answer this, consider CE generation as a game: The Pruner creates a critical subsystem $C$. The Adversary wins if it finds an MC satisfying $\varphi$ containing $C$, thus refuting that $C$ is a counterexample. In our setting, we change the game: The Adversary must select a family member rather than an arbitrary MC. Analogously, off-the-shelf CE generators construct a critical subsystem $C$ that for every possible extension of $C$ is a CE. These are CEs without context. In our game, the Adversary may not extend the MC arbitrarily, but must choose a family member. These are CEs modulo a family.

Back to the example:

Observe that for a CE for $D_r$, we could omit states $t$ and $s_1$ from the set $C$ of critical states: we know for sure that, once $D_r$ takes transition $(s_0, s_1)$, it will reach target state $t$ with probability at least 0.6. This exceeds the threshold 0.3, regardless of the value of the parameter $Y$. Hence, for family $\mathcal{D}$, the set $C' = \{s_0\}$ is a critical subsystem. The immediate advantage is that this set induces conflict $K_{C'} = \{X\}$ (parameter $Y$ has been generalized). This enables us to reject all realizations from the set $r{\uparrow}K_{C'}$, i.e., both realizations that assign $s_1$ to $X$. It is ‘easier’ to construct a CE for a (sub)family than for arbitrary MCs. More generally, a successful oracle needs to have access to useful bounds, and effectively integrate them in the CE generation.
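The argument can be replayed on a small explicit family. The sketch below assumes concrete weights for $B(s_1)$ and $B(s_2)$; these numbers are an assumption made for illustration, chosen only so that $s_1$ reaches $t$ with probability at least 0.6 and exactly one family member meets the 0.3 threshold, and they are not taken from Fig. 3. The helpers `reach` and `generalize` are likewise hypothetical.

```python
from fractions import Fraction as Fr
from itertools import product

# Family in the spirit of Fig. 3. The weights are ASSUMED for illustration.
B = {
    "s0": {"X": Fr(1)},
    "s1": {"T": Fr(6, 10), "Y": Fr(2, 10), "F": Fr(2, 10)},
    "s2": {"T": Fr(2, 10), "Y": Fr(2, 10), "F": Fr(6, 10)},
}
DOMAINS = {"X": ["s1", "s2"], "Y": ["t", "f"], "T": ["t"], "F": ["f"]}
THRESHOLD = Fr(3, 10)  # safety property: reach t with probability <= 0.3


def reach(r, s):
    """Probability of reaching t from s in the (acyclic) MC induced by r."""
    if s == "t":
        return Fr(1)
    if s == "f":
        return Fr(0)
    return sum(p * reach(r, r[k]) for k, p in B[s].items())


params = sorted(DOMAINS)
family = [dict(zip(params, vals))
          for vals in product(*(DOMAINS[k] for k in params))]


def generalize(r, conflict):
    """All realizations that agree with r on the conflict parameters."""
    return [q for q in family if all(q[k] == r[k] for k in conflict)]


# Lower bound on the reachability probability from s1 over the whole family.
lb_s1 = min(reach(r, "s1") for r in family)

# Since lb_s1 already exceeds the threshold and s0 moves to r(X) with
# probability 1, fixing X = s1 suffices to refute the property: conflict {X}.
r = {"X": "s1", "Y": "t", "T": "t", "F": "f"}
pruned = generalize(r, {"X"})
satisfying = [q for q in family if reach(q, "s0") <= THRESHOLD]
```

Under these assumed weights a single query removes half of the family, whereas the conflict $\{X, Y\}$ would remove only the queried realization.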

Counterexample construction

We develop an algorithm using bounds on reachability probabilities, similar to the bounds used above. Let us assume that for some set of realizations $\mathcal{R}$ and for every state $s$, we have bounds $\mathit{lb}_{\mathcal{R}}(s), \mathit{ub}_{\mathcal{R}}(s)$, such that for every $r \in \mathcal{R}$ we have $\mathit{lb}_{\mathcal{R}}(s) \le \mathbb{P}[D_r, s \models \Diamond T] \le \mathit{ub}_{\mathcal{R}}(s)$. Such bounds always exist (take 0 and 1). We see later how we compute these bounds. In what follows, we fix $r$ and denote $D_r = (S, s_0, P)$. Let us assume $D_r$ violates a safety property $\varphi \equiv \mathbb{P}_{\le \lambda}[\Diamond T]$. The following definition is central:

Definition 6 (Rerouting).

Let MC $D = (S, s_0, P)$ with $s_\top, s_\bot \notin S$, $C \subseteq S$ a set of expanded states and $\gamma : S \setminus C \to [0,1]$ a rerouting vector. The rerouting of MC $D$ w.r.t. $C$ and $\gamma$ is the MC $D{\downarrow}C[\gamma] = (S \cup \{s_\bot, s_\top\}, s_0, P^C_\gamma)$ with:

$$P^C_\gamma(s) = \begin{cases} P(s) & \text{if } s \in C, \\ [s_\top \mapsto \gamma(s),\ s_\bot \mapsto (1 - \gamma(s))] & \text{if } s \in S \setminus C, \\ [s \mapsto 1] & \text{if } s \in \{s_\top, s_\bot\}. \end{cases}$$

Essentially, $D{\downarrow}C[\gamma]$ extends the MC $D$ with additional sink states $s_\top$ and $s_\bot$ and replaces all outgoing transitions of any non-expanded state $s \in S \setminus C$ by a transition leading to $s_\top$ (with probability $\gamma(s)$) and a complementary one to $s_\bot$. We consider $s_\top$ to be the new target and let $\varphi'$ denote the updated property. The transition $s \xrightarrow{\gamma(s)} s_\top$ may be considered a ‘shortcut’ that by-passes successors of $s$ and leads straight to target $s_\top$ with probability $\gamma(s)$. To ensure that $D{\downarrow}C[\gamma]$ is a CE, the value $\gamma(s)$ must be a lower bound on the reachability probability from $s$ in $D$. When constructing a CE for a singular MC, we pick $\gamma = \mathbf{0}$, whereas when this MC is induced by a realization $r \in \mathcal{R}$, we can safely pick $\gamma = \mathit{lb}_{\mathcal{R}}$. The CE will be valid for every $r' \in \mathcal{R}$. It is a CE-modulo-$\mathcal{R}$.

Algorithmically, we employ a state-exploration approach and therefore start with $C^{(0)} = \emptyset$, i.e., all states are initially rerouted. If this is a CE, we are done. Otherwise, if the rerouting $D{\downarrow}C^{(0)}[\gamma]$ satisfies $\varphi'$, then we ‘expand’ some states to obtain a CE. Naturally, we must expand reachable states to change the satisfaction of $\varphi$. By expanding some state $s \in S$, we abandon the abstraction associated with the shortcut $s \xrightarrow{\gamma(s)} s_\top$ and replace it with concrete behavior that was inherent to state $s$ in MC $D$. Expanding a state cannot decrease the induced reachability probability as $\mathit{lb}_{\mathcal{R}}$ is a valid lower bound. This gradual expansion of the reachable state space continues until for some $C \subseteq S$ the corresponding rerouting $D{\downarrow}C[\gamma]$ violates $\varphi'$. This gradual expansion process terminates as $D{\downarrow}S[\gamma] \equiv D$ and our assumption is $D \not\models \varphi$. We show this process on an example.

Example 1.

Reconsider $\mathcal{D}$ in Fig. 3 with $\varphi \equiv \mathbb{P}_{\le 0.3}[\Diamond \{t\}]$. Using the method outlined below we get a lower-bound vector $\mathit{lb}_{\mathcal{R}}$ with $\mathit{lb}_{\mathcal{R}}(s_0) \le 0.3$, $\mathit{lb}_{\mathcal{R}}(s_1) = 0.6$, $\mathit{lb}_{\mathcal{R}}(t) = 1$ and $\mathit{lb}_{\mathcal{R}}(f) = 0$. Consider the gradual rerouting approach: We set $\gamma = \mathit{lb}_{\mathcal{R}}$, $C^{(0)} = \emptyset$ and have $D^{(0)} := D_r{\downarrow}C^{(0)}[\gamma]$, see Fig. 4(a). Verifying this MC against $\varphi' = \mathbb{P}_{\le 0.3}[\Diamond (T \cup \{s_\top\})]$ yields $\mathbb{P}[D^{(0)}, s_0 \models \Diamond (T \cup \{s_\top\})] = \gamma(s_0) \le 0.3$, i.e., the set $C^{(0)}$ is not a CE. We now expand the initial state, i.e., $C^{(1)} = \{s_0\}$ and let $D^{(1)} := D_r{\downarrow}C^{(1)}[\gamma]$, see Fig. 4(b). Verifying $D^{(1)}$ yields $\mathbb{P}[D^{(1)}, s_0 \models \Diamond (T \cup \{s_\top\})] = 1 \cdot \gamma(s_1) = 0.6 > 0.3$. Thus, the set $C^{(1)}$ is critical and the corresponding conflict is $K_{C^{(1)}} = \mathrm{supp}(B(s_0)) = \{X\}$. This is smaller than the naively computed conflict $\{X, Y\}$ induced by the CE $\{s_0, s_1, t\}$.

Fig. 4. Finding a CE to $D_r$ and $\varphi$ from Fig. 3 using the rerouting vector $\gamma = \mathit{lb}_{\mathcal{R}}$.

Algorithm 1: Counterexample construction based on rerouting.
Input: An MC $D_r$, a property $\varphi \equiv \mathbb{P}_{\bowtie \lambda}[\Diamond T]$ s.t. $D_r \not\models \varphi$, a rerouting vector $\gamma$.
Output: A conflict $K$ for $D_r$ and $\varphi$.
  $i \leftarrow 0$, $K^{(i)} \leftarrow \emptyset$
  while true do
    $C^{(i)}, H^{(i)} \leftarrow$ reachableViaHoles($D_r$, $K^{(i)}$)
    $D^{(i)} \leftarrow D_r{\downarrow}C^{(i)}[\gamma]$
    if $\mathbb{P}[D^{(i)} \models \Diamond (T \cup \{s_\top\})] \not\bowtie \lambda$ then return $K^{(i)}$
    $s \leftarrow$ chooseToExpand($H^{(i)}$, $K^{(i)}$)
    $K^{(i+1)} \leftarrow K^{(i)} \cup \mathrm{supp}(B(s))$
    $i \leftarrow i + 1$
  end while

Greedy state expansion strategy

Recall from Fig. 2 that for an MC $D_r$ with $D_r \not\models \varphi$, multiple CEs may exist inducing different conflicts. An efficient expansion strategy should yield a CE that induces a small number of relevant parameters (to prune more family members) and this CE is preferably obtained by a small number of model-checking queries. The method presented in Alg. 1 meets these criteria. The algorithm expands multiple states between subsequent model checks, while expanding only states that are associated with parameters that are relevant. In particular, in each iteration, we keep track of the set $K^{(i)}$ of relevant parameters, optimistically starting with $K^{(0)} = \emptyset$. We compute (see line 3) the set $C^{(i)}$ of states that are reachable from the initial state via states which are associated only with relevant parameters in $K^{(i)}$, i.e., via states for which $\mathrm{supp}(B(s)) \subseteq K^{(i)}$. Here, $H^{(i)}$ represents a state exploration ‘horizon’: the set of states reachable from $C^{(i)}$ but containing some (still) irrelevant parameters. We then construct the corresponding rerouting $D{\downarrow}C^{(i)}[\gamma]$ and check whether it is a CE. Otherwise, we greedily choose a state $s$ from the horizon $H^{(i)}$ containing the least number of irrelevant parameters and add these parameters to our conflict (see line 7).

Fig. 5. Conceptual hybrid (dual-oracle) synthesis. (Diagram labels: Learner, CE-Oracle, Abstr-Oracle; inputs $\mathcal{R}^{\mathcal{D}}, \Phi$ and $\mathcal{D}, \Phi$; $r \in \mathcal{R}$ + bounds; $\mathcal{R}' \subseteq \mathcal{R}$ violate $\Phi$; $\mathcal{R}' \subseteq \mathcal{R}$; bounds or $\mathcal{R}'$ violates; $r \models \Phi$; each $r \in \mathcal{R}'$, $r \models \Phi$; no $r \models \Phi$.)
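The greedy, rerouting-based construction of Alg. 1 can be sketched as follows for a safety property. This is a simplified illustration under stated assumptions, not the authors' implementation: the induced MC is given explicitly, `holes` lists only the non-trivial parameters of each state, the MC is assumed acyclic apart from absorbing self-loops (so a recursion stands in for a model checker), and the transition probabilities and bound values are assumed numbers. All function names are hypothetical.

```python
from fractions import Fraction as Fr


def reachable_via_holes(P, holes, init, K):
    """C: states reached only through states whose holes lie in K.
    H: the horizon, i.e. reached states that still have a hole outside K."""
    C, H, seen, stack = set(), set(), {init}, [init]
    while stack:
        s = stack.pop()
        if holes[s] <= K:
            C.add(s)
            for t in P[s]:
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        else:
            H.add(s)
    return C, H


def reach_rerouted(P, targets, C, gamma, s):
    """Probability of reaching a target or s_top from s in the rerouting of
    P w.r.t. C and gamma (assumes no cycles except absorbing self-loops)."""
    if s not in C:
        return gamma[s]          # shortcut to s_top with probability gamma(s)
    if s in targets:
        return Fr(1)
    if P[s].get(s) == 1:         # absorbing non-target state
        return Fr(0)
    return sum(p * reach_rerouted(P, targets, C, gamma, t)
               for t, p in P[s].items())


def greedy_conflict(P, holes, targets, init, gamma, threshold):
    """Conflict for an MC that violates 'reach targets with prob <= threshold'."""
    K = set()
    while True:
        C, H = reachable_via_holes(P, holes, init, K)
        if reach_rerouted(P, targets, C, gamma, init) > threshold:
            return K             # the rerouting is already a counterexample
        # Expand the horizon state that adds the fewest new parameters.
        s = min(H, key=lambda h: (len(holes[h] - K), h))
        K |= holes[s]


# Assumed example data (not taken from the figures).
P = {"s0": {"s1": Fr(1)}, "s1": {"t": Fr(8, 10), "f": Fr(2, 10)},
     "t": {"t": Fr(1)}, "f": {"f": Fr(1)}}
holes = {"s0": {"X"}, "s1": {"Y"}, "t": set(), "f": set()}
lb = {"s0": Fr(2, 10), "s1": Fr(6, 10), "t": Fr(1), "f": Fr(0)}
zero = {s: Fr(0) for s in P}

with_bounds = greedy_conflict(P, holes, {"t"}, "s0", lb, Fr(3, 10))
without_bounds = greedy_conflict(P, holes, {"t"}, "s0", zero, Fr(3, 10))
```

On this data the family-aware bounds stop the expansion after the initial state, while the zero vector forces the expansion of $s_1$ and therefore a larger conflict.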

The resulting conflict may not be minimal, but is computed fast. Our algorithm applies to probabilistic liveness properties too, using $\gamma = \mathit{ub}_{\mathcal{R}}$.

Computing bounds

We compute $\mathit{lb}_{\mathcal{R}}$ and $\mathit{ub}_{\mathcal{R}}$ using an abstraction [10]. The method considers some set $\mathcal{R}$ of realizations and computes the corresponding quotient Markov decision process (MDP) that over-approximates the behavior of all MCs in the family $\mathcal{R}$. Model checking this MDP yields an upper and a lower bound of the induced probabilities for all states over all realizations in $\mathcal{R}$. That is, $\mathit{Bound}(\mathcal{D}, \mathcal{R})$ computes $\mathit{lb}_{\mathcal{R}} \in \mathbb{R}^S$ and $\mathit{ub}_{\mathcal{R}} \in \mathbb{R}^S$ such that for each $s \in S$:

$$\mathit{lb}_{\mathcal{R}}(s) \le \min_{r \in \mathcal{R}} \mathbb{P}[D_r, s \models \Diamond T] \le \max_{r \in \mathcal{R}} \mathbb{P}[D_r, s \models \Diamond T] \le \mathit{ub}_{\mathcal{R}}(s).$$

To allow for refinement, two properties are crucial (with point-wise inequalities):
1. $\mathit{lb}_{\mathcal{R}} \le \mathit{lb}_{\mathcal{R}'} \wedge \mathit{ub}_{\mathcal{R}} \ge \mathit{ub}_{\mathcal{R}'}$ for $\mathcal{R}' \subseteq \mathcal{R}$, and
2. $\mathit{lb}_{\{r\}} = \mathit{ub}_{\{r\}}$ for $r \in \mathcal{R}$.

In [10], the abstraction and refinement together define an abstraction-refinement loop (AR) that addresses the feasibility problem. In the worst case, this loop analyses $2 \cdot |\mathcal{R}|$ quotient MDPs, which (as of now) may be arbitrarily larger than the number of family members they represent.
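A rough sketch of such a bound computation is given below. It makes a simplifying assumption about the quotient MDP, namely that every state may pick the values of its own parameters independently of all other states, so that the optimum is obtained by choosing the best value per parameter; it further assumes an acyclic family, where a recursion replaces MDP model checking. The weights and the helper `bound` are hypothetical.

```python
from fractions import Fraction as Fr


def bound(B, domains, targets, s, pick):
    """Bound on the probability of reaching targets from s.

    pick=min yields a lower bound, pick=max an upper bound. States that are
    neither targets nor keys of B are treated as non-target sinks.
    """
    if s in targets:
        return Fr(1)
    if s not in B:
        return Fr(0)
    return sum(p * pick(bound(B, domains, targets, v, pick) for v in domains[k])
               for k, p in B[s].items())


# Assumed weights for a family shaped like the running example.
B = {
    "s0": {"X": Fr(1)},
    "s1": {"T": Fr(6, 10), "Y": Fr(2, 10), "F": Fr(2, 10)},
    "s2": {"T": Fr(2, 10), "Y": Fr(2, 10), "F": Fr(6, 10)},
}
full = {"X": ["s1", "s2"], "Y": ["t", "f"], "T": ["t"], "F": ["f"]}
sub = dict(full, X=["s1"])                # sub-family after splitting on X
single = dict(full, X=["s1"], Y=["t"])    # a single realization

lb = {s: bound(B, full, {"t"}, s, min) for s in B}
ub = {s: bound(B, full, {"t"}, s, max) for s in B}
lb_sub = bound(B, sub, {"t"}, "s0", min)
ub_sub = bound(B, sub, {"t"}, "s0", max)
lb_one = bound(B, single, {"t"}, "s0", min)
ub_one = bound(B, single, {"t"}, "s0", max)
```

The values computed for the sub-family and for the single realization illustrate the two refinement properties: bounds can only tighten on a subset, and they coincide on a singleton.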

We introduce an extended synthesis loop in which the abstraction-based reasoning is used to prune the family $\mathcal{R}$, and to accelerate the CE-based oracle from Sect. 4. The intuitive idea is outlined in Fig. 5. Note that if the CE-based oracle is not exploited, we emulate AR (explained in computing bounds above), whereas if the abstraction oracle is not used, we emulate CEGIS (with the novel oracle).

Let us motivate combining these oracles in a flexible way. The naive version outlined in the previous section assumed a single abstraction step, and invokes CEGIS with the bounds obtained from that step. Evidently, the better (tighter) the bounds $\gamma$, the better the CEs. However, the abstraction-based bounds for $\mathcal{R}$ may be very loose. These bounds can be improved by splitting the set $\mathcal{R}$ and using the bounds on the two sub-families. The idea is to run a limited number of AR steps and then invoke CEGIS. Our experiments reveal that it can be crucial to be adaptive, i.e., the integrated method must be able to detect at run time when to switch.

(Footnote: Some care is required regarding loops, see [9].)

Algorithm 2: Hybrid (dual-oracle) synthesis.
Input: A family $\mathcal{D}$, a reachability property $\varphi$.
Output: Either a member $r$ in $\mathcal{D}$ with $r \models \varphi$, or no such $r$ exists in $\mathcal{D}$.
  $\mathfrak{R} \leftarrow \{\mathcal{R}^{\mathcal{D}}\}$  // each analysed (sub-)family also holds bounds
  $\delta_{\mathrm{CEGIS}} \leftarrow$ …  // time allocation factor for CEGIS
  while true do
    result, $\mathfrak{R}'$, $\sigma_{\mathrm{AR}}$, $t_{\mathrm{AR}}$ $\leftarrow$ AR.run($\mathfrak{R}$, $\varphi$)
    if result.decided() then return result
    CEGIS.setTimeout($t_{\mathrm{AR}} \cdot \delta_{\mathrm{CEGIS}}$)
    result, $\sigma_{\mathrm{CEGIS}}$, $\mathfrak{R}''$ $\leftarrow$ CEGIS.run($\mathfrak{R}'$, $\varphi$)
    if result.decided() then return result
    $\delta_{\mathrm{CEGIS}} \leftarrow \sigma_{\mathrm{CEGIS}} / \sigma_{\mathrm{AR}}$
    $\mathfrak{R} \leftarrow \mathfrak{R}''$
  end while

The proposed hybrid method switches between AR and CEGIS, where we allow for refining during the AR phase and use the obtained refined bounds during CEGIS. Additionally, we estimate the efficiency $\sigma$ (e.g., the number of pruned MCs per time unit) of the two methods and allocate more time $t$ to the method with superior performance. That is, if we detect that CEGIS prunes sub-families twice as fast as AR, we double the time in the next round for CEGIS. The resulting algorithm is summarized in Alg. 2. Recall that AR (invoked via AR.run in Alg. 2) takes one family from $\mathfrak{R}$, either solves it or splits it and returns the set of undecided families $\mathfrak{R}'$. In contrast, CEGIS processes multiple families from $\mathfrak{R}'$ until the timeout and then returns the set of undecided families $\mathfrak{R}''$. This workflow is motivated by the fact that one iteration of AR (i.e., the involved MDP model-checking) is typically significantly slower than one CEGIS iteration.

Remark 1.

Although the developed framework for integrated synthesis has been discussed in the context of feasibility with respect to a single property $\varphi$, it can be easily generalized to handle multiple-property specifications as well as to treat optimal synthesis. Regarding multiple properties, the idea remains the same: analyzing the quotient MDP with respect to multiple properties yields multiple probability bounds. After initiating a CEGIS loop and obtaining an unsatisfying realization, we can construct a separate conflict for each unsatisfied property, while using the corresponding probability bound to enhance the CE generation process. Optimal synthesis is handled similarly to feasibility, but, after obtaining a satisfying solution, we update the optimizing property to exclude this solution: e.g., for maximal synthesis this translates to increasing the threshold of the maximizing property. Having exhausted the search space of family members, the last obtained solution is declared to be the optimal one.

model      |K|    |R^D|    MDP size    avg. MC size
Grid
Maze       20     1M       9k          5.4k
DPM        16     43M      9.5k        2.2k
Pole       17     1.3M     6.6k        5.6k
Herman
Herman∗           3.1M

Table 1. Summary of the benchmarks and their statistics.
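The adaptive time allocation of Alg. 2 can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' code: the function name `hybrid_synthesis`, the callable-based oracle interface, the initial factor of 1.0, and the guard against a zero AR efficiency are all assumptions of this example, and the two oracles are replaced by scripted stubs.

```python
def hybrid_synthesis(families, ar_run, cegis_run, delta_cegis=1.0):
    """Alternate between an AR oracle and a time-limited CEGIS oracle.

    ar_run(families)             -> (result, families, sigma_ar, t_ar)
    cegis_run(families, timeout) -> (result, sigma_cegis, families)
    A result of None means "not decided yet".
    """
    while True:
        result, families, sigma_ar, t_ar = ar_run(families)
        if result is not None:
            return result
        # CEGIS gets the time AR just used, scaled by the allocation factor.
        result, sigma_cegis, families = cegis_run(families, t_ar * delta_cegis)
        if result is not None:
            return result
        # Shift time towards the oracle that pruned more MCs per time unit.
        if sigma_ar > 0:  # guard against division by zero (added in this sketch)
            delta_cegis = sigma_cegis / sigma_ar


# Scripted stand-ins for the two oracles (purely illustrative).
timeouts = []

def ar_stub(families):
    # Undecided; efficiency of 2 pruned MCs per time unit; 3 time units spent.
    return None, families, 2.0, 3.0

def cegis_stub(families, timeout):
    timeouts.append(timeout)
    decided = "unsat" if len(timeouts) == 2 else None
    return decided, 4.0, families  # CEGIS prunes twice as fast as AR

result = hybrid_synthesis(["whole family"], ar_stub, cegis_stub)
```

With these stubs CEGIS is twice as efficient as AR, so its time budget doubles from 3.0 to 6.0 time units in the second round, which mirrors the doubling example given in the description of the hybrid method.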

Implementation.

We implemented the hybrid oracle on top of the probabilistic model checker Storm [18]. While the high-performance parts were implemented in C++, we used a Python API to flexibly construct the overall synthesis loop. For SMT solving, we used Z3 [29]. The tool chain takes a PRISM [27] or JANI [6] sketch and a set of temporal properties, and returns a satisfying realization if one exists, or reports that no such realization exists. The implementation in the form of an artefact is available at https://zenodo.org/record/4422543.

Set-up.

We compare the adaptive oracle-guided synthesis with two state-of-the-art synthesis methods: program-level CEGIS [9] using a MaxSat CE generation [16,41] and AR [10]. These use the same architecture and data structures from Storm. All experiments are run on an Ubuntu 19.04 machine with an Intel i5-8300H (4 cores at 2.3 GHz) and up to 8 GB RAM, with all algorithms executed on a single thread. The benchmarks consist of five different models, see Table 1, from various domains that were used in [9,10]. As opposed to the benchmarks considered in [9,10], we use larger variants of Grid and Herman to better demonstrate differences in the performance of the individual methods.

To investigate the scalability of the methods, we consider a new variant of the Herman model that allows us to scale the number of randomization strategies and thus the family size. In particular, we will compare performance on two instances of different sizes: small Herman∗ (5k members) and large Herman∗ (3.1M members; other statistics are reported in Table 1).

To reason about the pruning efficiency of different synthesis methods, we want to avoid feasible synthesis problems, where the order of family exploration can lead to inconsistent performance. Instead, we primarily focus on non-feasible problems, where all realizations need to be explored in order to prove unsatisfiability. The experimental evaluation is presented in three parts. (1) We evaluate the novel CE construction method and compare it with the MaxSat-based oracle from [9]. (2) We compare the hybrid synthesis loop with the two baselines AR and CEGIS. (3) We consider novel hard synthesis instances (multi-property synthesis, finding optimal programs) on instances of the model Herman∗.

Comparing CE construction methods

We consider the quality of the CEs and their generation time. In particular, we want to investigate (1) whether using CEs-modulo-families yields better CEs, and (2) how the quality of CEs from the smart oracle compares to the MaxSat-based oracle, and how their time consumption compares. As a measure of quality of a CE, we take the average number of its relevant parameters relative to the total number of parameters. That is, smaller ratios imply better CEs. To measure the influence of using CEs-modulo-families, two types of bounds are used: (i) trivial bounds (i.e., $\gamma = 0$ for safety and $\gamma = 1$ for liveness properties), and (ii) non-trivial bounds corresponding to the entire family $\mathcal{R}^{\mathcal{D}}$, representing the most conservative estimate. The results are reported in (the left part of) Table 2. In the next subsection, we investigate the same benchmarks from the point of view of the performance of the synthesis methods, which also shows the immediate effect of the new CE generation strategy.

The first observation is that using non-trivial bounds (as opposed to trivial ones) for the state expansion approach can drastically decrease the conflict size. It turns out that the CEs obtained using the greedy approach are mostly larger than those obtained with the MaxSat method. However (see Grid), even for trivial bounds, we may obtain smaller CEs than for MaxSat: computing a minimal-command CE does not necessarily induce an optimal conflict. On the other hand, comparing the run-times in the parentheses, one can see that computing CEs via the greedy state expansion is orders of magnitude faster than computing command-optimal ones using MaxSat. Note that the greedy method makes at most $|K|$ model-checking queries to compute CEs, while the MaxSat method may make exponentially many such queries. Overall, the greedy method using the non-trivial bounds is able to obtain CEs of comparable quality to the MaxSat method, while being orders of magnitude faster.

          CE quality                               CEGIS          AR             Hybrid
model     MaxSat   trivial        non-trivial      iters  time    iters  time    iters       time
Grid
  ∗
Maze
  ∗
DPM
  ∗
Pole      -        0.87 (0.062)   0.16             -      -       309    12      (3, 5)
  ∗       -        0.54 (0.041)   0.29             -      -       615    23      (80, 61)
Herman    -        0.91 (0.011)   0.50             -      -       171    86      (24, 1)
  ∗       -        0.88 (0.016)   0.87             -      -       643    269     (485, 13)

Table 2. CE quality for different methods and performance of three synthesis methods. For each model/property, we report results for two different thresholds, where the symbol '∗' marks the one closer to the feasibility threshold, representing the more difficult synthesis problem. The symbol '-' marks a two-hour timeout. CE quality: the presented numbers give the CE quality (i.e., the smaller, the better). The numbers in parentheses represent the average run-time of constructing one CE in seconds (run-times for constructing CEs using non-trivial bounds are similar to those for trivial ones and are thus not reported). Performance: for each method, we report the number of iterations (for the hybrid method, the reported values are iterations of the CEGIS and AR oracle, respectively) and the run-time in seconds.

Performance comparison with AR/CEGIS

We compare the hybrid synthesis loop from Sect. 5 with two state-of-the-art baselines: CEGIS and AR. The results are displayed in (the right half of) Table 2. In all 10 cases, the hybrid method outperforms the baselines. It is up to an order of magnitude faster.

Let us discuss the performance of the hybrid method. We classify benchmarks along two dimensions: (1) the performance of CEGIS and (2) the performance of AR. Based on the empirical performance, we classify Grid as good-for-CEGIS (and not for AR), Maze, Pole and DPM as good-for-AR (and not for CEGIS), and Herman as hard (for both). Roughly, AR works well when the quotient MDP does not blow up and its analysis is precise due to consistent schedulers, i.e., when the parameter dependencies are not crucial for a precise analysis. CEGIS performs well when the CEs are small and fast to compute. On the other hand, synthesis problems for which neither pure CEGIS nor pure AR is able to effectively reason about non-trivial subfamilies inherently profit from a hybrid method. The main point we want to discuss is how the hybrid method reinforces the strengths of both methods, rather than their weaknesses.

In the hybrid method, there are two factors that determine the efficiency: (i) how fast do we get bounds on the reachability probability that are tight enough to enable construction of good counterexamples? and (ii) how good are the constructed counterexamples?

The former factor is attributed to the proposed adaptive scheme (see Alg. 2), where the method will prefer AR-like analysis and continue refinement until the computed bounds allow construction of small counterexamples. The latter is discussed above. Let us now discuss how these two aspects are reflected in the benchmarks.

In good-for-CEGIS benchmarks like Grid, after analyzing a quotient MDP for the whole family, the hybrid method mostly profits from better CEs yielding better bounds, thus outperforming CEGIS. Indeed, the CEs are found so fast that the bottleneck is no longer their generation. This also explains why the speedup of CE generation does not translate directly into a speedup of the overall synthesis loop. In the good-for-AR benchmark DPM, the hybrid method provides only a minor improvement, as it has to perform a large number of AR iterations before the novel CE-based pruning can be effectively used. This can be considered the worst-case scenario for the hybrid method. On other good-for-AR benchmarks like Maze and Pole, the good performance of AR allows us to quickly obtain tight bounds, which can then be exploited by CEGIS. Finally, in hard models like Herman, abstraction-refinement is very expensive, but even the first round yields bounds that, as opposed to the trivial bounds, enable good CEs: CEGIS can keep using these bounds to quickly prune the state space.