DCSynth: Guided Reactive Synthesis with Soft Requirements

Abstract

In reactive controller synthesis, several implementations (controllers) are possible for a given specification because the specification is incomplete. To choose the most desirable one from the various options, we need to specify additional properties which can guide the synthesis. In this paper, we propose a technique for guided controller synthesis from regular requirements which are specified using the interval temporal logic QDDC. We find that QDDC is well suited for guided synthesis due to its superiority in dealing with both qualitative and quantitative specifications. Our framework allows specifications consisting of both hard and soft requirements, given as QDDC formulas. We have also developed a method and a tool, DCSynth, which computes a controller that invariantly satisfies the hard requirement and optimally meets the soft requirement. The proposed technique is also useful in dealing with conflicting (i.e., unrealizable) requirements, by making some of them soft requirements. Case studies are carried out to demonstrate the effectiveness of soft requirement guided synthesis in obtaining high-quality controllers. The quality of the synthesized controllers is compared using metrics measuring both the guaranteed and the expected case behaviour of the controlled system. The tool DCSynth facilitates such comparison.


Amol Wakankar, Paritosh K. Pandya, and Raj Mohan Matteplackel

Homi Bhabha National Institute, Mumbai, India. Bhabha Atomic Research Centre, Mumbai, India. Tata Institute of Fundamental Research, Mumbai 400005, India.


Reactive synthesis aims at constructing a controller (say, a Mealy machine) algorithmically from a given temporal logic specification of its desired behaviour. A considerable amount of research has gone into the area of reactive synthesis, and several tools are available for experimenting with reactive synthesis [13]. However, existing tools do not have the capability to guide the synthesis towards the most desirable controller. In practice, a user specification may be incomplete, and it may contain certain requirements which are not mandatory but desirable. We term the desirable properties soft requirements.

In this work, we propose a specification consisting of hard requirements, which are mandatory and need to be satisfied invariantly, and soft requirements, which are desirable and should be satisfied at as many points in the execution as possible. We choose to specify the hard and soft requirements as regular properties in the logic Quantified Discrete Duration Calculus (QDDC) [20, 21]. QDDC is the discrete time variant of Duration Calculus proposed by Zhou et al. [8, 33].

Regular properties can conceptually be specified by a deterministic finite state automaton (DFA). At any point in the execution, a regular property holds provided the past behaviour up to that point is accepted by its DFA. The study of synthesis of controllers for such properties was pioneered by Ramadge and Wonham [18, 25, 26]. QDDC is an interval temporal logic which has the expressive power of regular languages. Section 2 presents the syntax and semantics of this logic. Prior work [17, 20, 21] shows that any formula in QDDC can be effectively translated into a language equivalent DFA over finite words. QDDC's bounded counting features, interval based modalities and regular expression like primitives allow complex qualitative and quantitative properties (such as latency and resource constraints) to be specified succinctly and modularly (see the example below). With illustrations, papers [19, 21, 22] show how quantitative and qualitative regular properties can be succinctly specified in QDDC. Paper [19] also gives a comparison with other logics such as LTL and PSL. It should be noted that QDDC does not allow specification of general liveness properties; however, time bounded liveness can be specified. The following example motivates the need for soft requirement guided synthesis.

Example 1 (Arbiter for Mutually Exclusive Shared Resource).

The arbiter has an input $r_i$ (denoting request for access) and an output $a_i$ (denoting acknowledgement for access) for each client $1 \le i \le n$. The specification consists of the following two properties, given as QDDC formulas together with their intuitive explanation. Section 2 gives the formal syntax and semantics of QDDC.

− Mutual Exclusion Requirement $R_1$: [[ $\bigwedge_{i \neq j} \neg(a_i \wedge a_j)$ ]], states that at every point, the access to the shared resource should be mutually exclusive.
− k-cycle Response Requirement $R_2$: []( $\bigwedge_i$ (([[$r_i$]] && (slen >= (k − 1))) ⇒ (scount $a_i$ > 0))), states that in any observation interval spanning $k$ cycles, if the request from the $i$-th client ($r_i$) is continuously high during the interval, then that client should get at least one access ($a_i$) within the observation interval. Modality []D states that sub-formula D should hold for all observation intervals. The term slen gives the length of the observation interval and the term (scount P) counts the number of occurrences of proposition P within the observation interval. The property $R_2$ ranges over all clients $1 \le i \le n$.

When $k < n$ no controller can satisfy both requirements (their conjunction is unrealizable); consider the case where all clients request all the time. We may want to opt for an implementation which mandatorily satisfies $R_1$ and meets $R_2$ "as much as possible", by making $R_2$ a soft requirement. One possible controller in such a case will make $a_i$ true in a round-robin manner for all the requesting clients. Let $r$ denote the number of clients requesting simultaneously. The round-robin policy will mandatorily satisfy $R_2$ when $r \le k$. But when $r > k$ (overload condition), this controller will be able to meet the response time requirement $R_2$ for $k$ out of $n$ clients. Using the soft requirement, we can also give priority to the clients which meet response time requirements under the overload condition. Thus, soft requirements allow us to choose the preferred one from several candidate controllers. □

This paper introduces a tool DCSynth which allows synthesis of controllers from regular properties (QDDC formulas).
The specification in DCSynth is a tuple $(I, O, D^h, D^s)$, where $D^h$ and $D^s$ are QDDC formulas over a set of input and output propositions $(I, O)$. Here, $D^h$ and $D^s$ are the hard and the soft requirement, respectively. We use the term supervisor for a non-blocking Mealy machine which may non-deterministically produce one or more outputs for each input. A supervisor may be refined to a sub-supervisor by resolving (pruning) the non-deterministic choice of outputs (the sub-supervisor may use additional memory for making the choice). We define a determinism ordering on supervisors in the paper. A controller is a deterministic supervisor. Ramadge and Wonham [25, 26] investigated the synthesis of the maximally permissive supervisor for a regular specification. The maximally permissive supervisor is a unique supervisor which encompasses all the behaviours invariantly satisfying the specified regular property (see Definition 6). The well known safety synthesis algorithm applied to the DFA for $D^h$ gives us the maximally permissive supervisor $MPS(D^h)$ [10]. If no such supervisor exists, the specification is reported as unrealizable.

Any controller obtained by arbitrarily resolving the nondeterministic choices for outputs in $MPS(D^h)$ is correct-by-construction. This results in several controllers with distinct behaviours (as shown by the previous example). Thus, correct-by-construction synthesis alone is not sufficient [3]. Some form of guidance must be provided to the synthesis method to choose among the possible controllers. We use the soft requirements to provide such guidance. Our synthesis method tries to choose a controller which satisfies the soft requirements ($D^s$) "as much as possible". A soft requirement can also specify desirable requirements which cannot be met invariantly. For example, in a mine-pump controller, the soft requirement "keep the pump off unless mandated by the hard requirement" specifies an energy efficient controller.
Specifications of scheduling, performance and quality constraints are often such desirable properties. Moreover, a specification may consist of a conjunction of conflicting requirements. In this case, all the requirements cannot be invariantly met simultaneously. The user may resolve the conflict(s) by making some of these requirements soft. Therefore, soft requirements give us the capability to synthesize meaningful and practical controllers.

In DCSynth, we formalize the notion of a controller meeting the soft requirement $D^s$ "as much as possible" by synthesizing a sub-supervisor of $MPS(D^h)$ (guaranteeing invariance of $D^h$) which maximizes the expected value of the count of $D^s$ in the next $H$ moves when averaged over all the inputs. The classical value iteration algorithm due to Bellman [2] allows us to compute this $H$-optimal sub-supervisor. This can be further refined to a controller as desired. Thus, our synthesis method gives a controller which (a) invariantly satisfies $D^h$, and (b) is $H$-optimal for $D^s$ amongst all controllers meeting condition (a). (The tool supports a more general lexicographically ordered list of soft requirements. However, we omit the general case for brevity.)

The above synthesis method is implemented in the tool DCSynth. An efficient representation of DFA using BDDs, originally introduced by the tool MONA [15], is used for representing both automata and supervisors. We adapt the safety synthesis algorithm and the value iteration algorithm so that they work symbolically over this MONA DFA representation.

We illustrate our specification method and synthesis tool with the help of two case studies. We define metrics to compare the controllers for their guaranteed and expected behaviour. The tool DCSynth facilitates measurement of both these metrics. The main contributions of this paper are as follows:

– We develop a technique for the synthesis of controllers from QDDC requirements. This extends the past work on model checking interval temporal logic QDDC [7, 17, 20, 21, 29] with synthesis abilities.
– We propose a method for guided synthesis of controllers based on soft requirements which are met in an $H$-optimal fashion. Conceptually, this enhances the Ramadge-Wonham framework for optimal controller synthesis.
– We present a tool DCSynth for guided synthesis, which
  • represents and manipulates automata/supervisors using BDD-based semi-symbolic DFA. It uses eager minimization for efficient synthesis, and
  • provides a facility to compare both the guaranteed and expected case behaviours of the candidate controllers.
– We analyse the impact of soft requirements on the quality of the synthesized controllers experimentally using case studies.

The rest of the paper is arranged as follows. Section 2 describes the syntax and semantics of QDDC. Important definitions are presented in Section 3. The syntax of the DCSynth specification and the controller synthesis method are presented in Section 4. Section 5 discusses case studies and experimental results. The paper is concluded with a discussion and related work in Sections 6 and 7.

Let $PV$ be a finite non-empty set of propositional variables. Let $\sigma$ be a non-empty finite word over the alphabet $2^{PV}$. It has the form $\sigma = P_0 \cdots P_n$ where $P_i \subseteq PV$ for each $i \in \{0, \ldots, n\}$. Let $len(\sigma) = n + 1$, $dom(\sigma) = \{0, \ldots, n\}$, $\sigma[i, j] = P_i \cdots P_j$ and $\sigma[i] = P_i$.

The syntax of a propositional formula over variables $PV$ is given by:

$\varphi := false \mid true \mid p \in PV \mid\ !\varphi \mid \varphi \ \&\&\ \varphi \mid \varphi \ ||\ \varphi$

with &&, ||, ! denoting conjunction, disjunction and negation, respectively. Operators such as ⇒ and ⇔ are defined as usual. Let $\Omega(PV)$ be the set of all propositional formulas over variables $PV$. Let $i \in dom(\sigma)$. Then the satisfaction of propositional formula $\varphi$ at point $i$, denoted $\sigma, i \models \varphi$, is defined as usual and omitted here for brevity.

The syntax of a QDDC formula over variables $PV$ is given by:

$D := \langle\varphi\rangle \mid [\varphi] \mid [[\varphi]] \mid D$ ^ $D \mid\ !D \mid D \ ||\ D \mid D \ \&\&\ D \mid ex\ p.\ D \mid all\ p.\ D \mid slen \bowtie c \mid scount\ \varphi \bowtie c \mid sdur\ \varphi \bowtie c$

where $\varphi \in \Omega(PV)$, $p \in PV$, $c \in \mathbb{N}$ and $\bowtie\ \in \{<, \le, =, \ge, >\}$. (DCSynth can be downloaded at [31] along with the specification files for experiments.)

An interval over a word $\sigma$ is of the form $[b, e]$ where $b, e \in dom(\sigma)$ and $b \le e$. Let $Intv(\sigma)$ be the set of all intervals over $\sigma$. Let $\sigma$ be a word over $2^{PV}$ and let $[b, e] \in Intv(\sigma)$ be an interval. Then the satisfaction relation of a QDDC formula $D$ over $\sigma$ and interval $[b, e]$, written as $\sigma, [b, e] \models D$, is defined inductively as follows:

$\sigma, [b, e] \models \langle\varphi\rangle$ iff $b = e$ and $\sigma, b \models \varphi$,
$\sigma, [b, e] \models [\varphi]$ iff $b < e$ and $\forall b \le i < e : \sigma, i \models \varphi$,
$\sigma, [b, e] \models [[\varphi]]$ iff $\forall b \le i \le e : \sigma, i \models \varphi$,
$\sigma, [b, e] \models \{\{\varphi\}\}$ iff $e = b + 1$ and $\sigma, b \models \varphi$,
$\sigma, [b, e] \models D_1$ ^ $D_2$ iff $\exists b \le i \le e : \sigma, [b, i] \models D_1$ and $\sigma, [i, e] \models D_2$,

with Boolean combinations $!D$, $D_1 \ ||\ D_2$ and $D_1 \ \&\&\ D_2$ defined in the expected way. We call a word $\sigma'$ a $p$-variant, $p \in PV$, of a word $\sigma$ if $\forall i \in dom(\sigma), \forall q \neq p : q \in \sigma'[i] \Leftrightarrow q \in \sigma[i]$. Then $\sigma, [b, e] \models ex\ p.\ D \Leftrightarrow \sigma', [b, e] \models D$ for some $p$-variant $\sigma'$ of $\sigma$, and $(all\ p.\ D) \Leftrightarrow (!\,ex\ p.\ !D)$.

Entities slen, scount and sdur are called terms. The term slen gives the length of the interval in which it is measured; scount $\varphi$, where $\varphi \in \Omega(PV)$, counts the number of positions including the last point in the interval under consideration where $\varphi$ holds; and sdur $\varphi$ gives the number of positions excluding the last point in the interval where $\varphi$ holds. Formally, for $\varphi \in \Omega(PV)$ we have

$slen(\sigma, [b, e]) = e - b$,
$scount(\sigma, \varphi, [b, e]) = \sum_{i=b}^{e} \begin{cases} 1 & \text{if } \sigma, i \models \varphi, \\ 0 & \text{otherwise,} \end{cases}$
$sdur(\sigma, \varphi, [b, e]) = \sum_{i=b}^{e-1} \begin{cases} 1 & \text{if } \sigma, i \models \varphi, \\ 0 & \text{otherwise.} \end{cases}$

We also define the following derived constructs: $pt = \langle true \rangle$, $ext =\ !pt$, $\langle\rangle D = true$ ^ $D$ ^ $true$, $[]D = (!\langle\rangle !D)$ and $pref(D) =\ !((!D)$ ^ $true)$. Thus, $\sigma, [b, e] \models []D$ iff $\sigma, [b', e'] \models D$ for all sub-intervals $b \le b' \le e' \le e$, and $\sigma, [b, e] \models pref(D)$ iff $\sigma, [b, e'] \models D$ for all prefix intervals $b \le e' \le e$.

Definition 1 (Language of a formula).

Let $\sigma, i \models D$ iff $\sigma, [0, i] \models D$, and $\sigma \models D$ iff $\sigma, len(\sigma) - 1 \models D$. We define $L(D) = \{\sigma \mid \sigma \models D\}$, the set of behaviours accepted by $D$. Formula $D$ is called valid, denoted $\models_{dc} D$, iff $L(D) = (2^{PV})^+$. □

Thus, a formula $D$ holds at a point $i$ in a behaviour provided the past of the point $i$ satisfies $D$.

Theorem 1. [21] For every formula $D$ over variables $PV$ we can construct a Deterministic Finite Automaton (DFA) $A(D)$ over alphabet $2^{PV}$ such that $L(A(D)) = L(D)$. We call $A(D)$ a formula automaton for $D$ or the monitor automaton for $D$. □

The tool DCVALID implements this formula automaton construction in an efficient manner by internally using the tool MONA [15]. It gives a minimal deterministic automaton (DFA) for the formula $D$. We omit the details here. The reader may refer to several papers on QDDC for detailed descriptions and examples of QDDC specifications as well as its model checking tool DCVALID [7, 19–21, 29].

Supervisor and Controller

In this section we present QDDC formulas and automata where the variables $PV = I \cup O$ are partitioned into disjoint sets of input variables $I$ and output variables $O$. It is known that supervisors and controllers can be expressed as Mealy machines with special properties. Here we show how Mealy machines can be represented as a special form of deterministic finite automata (DFA). This representation allows us to use the MONA DFA library [15] to compute supervisors and controllers efficiently using our tool DCSynth.

Definition 2 (Output-nondeterministic Mealy Machine). A total and Deterministic Finite Automaton (DFA) over the input-output alphabet $\Sigma = 2^I \times 2^O$ is a tuple $A = (Q, \Sigma, s, \delta, F)$ having the conventional meaning, where $\delta : Q \times 2^I \times 2^O \to Q$. An output-nondeterministic Mealy machine is a DFA with a unique reject (or non-final) state $r$ which is a sink state, i.e., $F = Q - \{r\}$ and $\delta(r, i, o) = r$ for all $i \in 2^I$, $o \in 2^O$. □

The intuition behind this definition is that the transitions from $q \in F$ to $r$ are forbidden (and kept only for making the DFA total). The language of any such Mealy machine is prefix-closed. Recall that for a Mealy machine, $F = Q - \{r\}$. A Mealy machine is deterministic if $\forall s \in F$, $\forall i \in 2^I$, there is at most one $o \in 2^O$ such that $\delta(s, i, o) \neq r$.

Definition 3 (Non-blocking Mealy Machine).

An output-nondeterministic Mealy machine is called non-blocking if $\forall s \in F$, $\forall i \in 2^I$, $\exists o \in 2^O$ such that $\delta(s, i, o) \in F$. It follows that for all input sequences a non-blocking Mealy machine can produce one or more output sequences without ever getting into the reject state. □

For a Mealy machine $M$ over variables $(I, O)$, its language $L(M) \subseteq (2^I \times 2^O)^*$. A word $\sigma \in L(M)$ can also be represented as a pair $(ii, oo) \in ((2^I)^*, (2^O)^*)$ such that $\sigma[k] = ii[k] \cup oo[k]$, $\forall k \in dom(\sigma)$. Here $\sigma$, $ii$, $oo$ must have the same length. Note that in the rest of this paper, we do not distinguish between $\sigma$ and $(ii, oo)$. Also, for any input sequence $ii \in (2^I)^*$, we define $M[ii] = \{oo \mid (ii, oo) \in L(M)\}$.

Definition 4 (Controllers and Supervisors). An output-nondeterministic Mealy machine which is non-blocking is called a supervisor. A deterministic supervisor is called a controller. □

The non-deterministic choice of outputs in a supervisor denotes an unresolved decision. The determinism ordering defined below allows supervisors to be refined into controllers.
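Definitions 2 to 4 can be made concrete with a small explicit-state sketch. This is only an illustration under simplifying assumptions: inputs and outputs are abstract letters rather than subsets of $I$ and $O$, the class name `MealyMachine` and the one-client arbiter are hypothetical, and the actual tool uses a BDD-based representation instead of a Python dictionary.

```python
REJECT = "r"  # the unique reject sink state of Definition 2


class MealyMachine:
    """Output-nondeterministic Mealy machine stored as a total DFA over (input, output) letters."""

    def __init__(self, states, inputs, outputs, delta):
        self.states = set(states) | {REJECT}
        self.inputs = list(inputs)
        self.outputs = list(outputs)
        # (state, input, output) -> state; an absent entry means "go to REJECT".
        self.delta = dict(delta)

    def step(self, q, i, o):
        # The reject state is a sink, and every unlisted transition leads to it (totality).
        if q == REJECT:
            return REJECT
        return self.delta.get((q, i, o), REJECT)

    def legal_outputs(self, q, i):
        return [o for o in self.outputs if self.step(q, i, o) != REJECT]

    def is_non_blocking(self):
        # Definition 3: every final state has some legal output for every input.
        final = self.states - {REJECT}
        return all(self.legal_outputs(q, i) for q in final for i in self.inputs)

    def is_deterministic(self):
        # At most one output avoids the reject state for every final state and input.
        final = self.states - {REJECT}
        return all(len(self.legal_outputs(q, i)) <= 1 for q in final for i in self.inputs)

    def is_supervisor(self):
        return self.is_non_blocking()

    def is_controller(self):
        return self.is_non_blocking() and self.is_deterministic()


# Hypothetical one-client arbiter: may grant or withhold on a request, never grants when idle.
supervisor = MealyMachine(
    states={"q0"}, inputs=["req", "idle"], outputs=["ack", "noack"],
    delta={("q0", "req", "ack"): "q0", ("q0", "req", "noack"): "q0",
           ("q0", "idle", "noack"): "q0"},
)
# Pruning the "noack on req" choice resolves the non-determinism.
controller = MealyMachine(
    states={"q0"}, inputs=["req", "idle"], outputs=["ack", "noack"],
    delta={("q0", "req", "ack"): "q0", ("q0", "idle", "noack"): "q0"},
)
print(supervisor.is_supervisor(), supervisor.is_controller())  # True False
print(controller.is_supervisor(), controller.is_controller())  # True True
```

The second machine accepts a subset of the behaviours of the first, which is the situation the determinism ordering is meant to capture.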

Deﬁnition 5 (Determinism Order and Sub-supervisor).

Given two supervisors $S_1$ and $S_2$, we say $S_1 \le_{det} S_2$ ($S_2$ is more deterministic than $S_1$) iff $L(S_2) \subseteq L(S_1)$. We call $S_2$ a sub-supervisor of $S_1$. □

Note that being supervisors, they are both non-blocking and hence $\emptyset \subset S_2[ii] \subseteq S_1[ii]$ for any $ii \in (2^I)^*$. The supervisor $S_2$ may make use of additional memory for resolving and pruning the non-determinism in $S_1$.

For technical convenience, we define a notion of indicator variable for a QDDC formula (regular property). The idea behind this is that the indicator variable $w$ witnesses the truth of a formula $D$ at any point in the execution. Thus,

$Ind(D, w) = pref(EP(w) \Leftrightarrow D)$

Here, $EP(w) = (true$ ^ $\langle w \rangle)$, i.e. $EP(w)$ holds at a point $i$ if variable $w$ is true at that point $i$. Hence, $w$ will be true exactly at those points where $D$ is true. The formula automaton $A(Ind(D, w))$ gives us a controller with input-output alphabet $(I \cup O, w)$ such that it outputs $w = 1$ on a transition iff the past satisfies $D$. Since our formula automata are minimal DFA, $A(Ind(D, w))$ characterizes the least memory needed to track the truth of formula $D$.

This section defines the DCSynth specification and presents the algorithm used in our tool DCSynth for soft requirement guided controller synthesis from a DCSynth specification. The process of synthesizing a controller as discussed in Section 4.4 uses three main algorithms given in Sections 4.1-4.3.

A QDDC formula $D$ specifies a regular property which may hold intermittently during a behaviour (see Definition 1). An important class of properties, denoted by inv $D$, states that $D$ must hold invariantly during the system behaviour.

Definition 6. Let "$S$ realizes inv $D$" denote that a supervisor $S$ realizes invariance of QDDC formula $D$ over variables $(I, O)$. Define $S$ realizes inv $D$ provided $L(S) \subseteq L(D)$. Recall that, by the definition of supervisors, $S$ must be non-blocking. A supervisor $S$ for a formula $D$ is called maximally permissive iff $S \le_{det} S'$ holds for any supervisor $S'$ such that $S'$ realizes inv $D$. This $S$ (when it exists) is unique up to language equivalence of automata, and the minimum state maximally permissive supervisor is denoted as $MPS(D)$. □

Now, we discuss how $MPS(D)$ for a given QDDC formula $D$ is computed.

1. A language equivalent DFA $A(D) = \langle S, 2^{I \cup O}, \delta, F \rangle$ is constructed for formula $D$ (Theorem 1). The standard safety synthesis algorithm [12] over $A(D)$ gives us the desired $MPS(D)$ as outlined in the following steps.
2. We first compute the largest set of winning states $G \subseteq F$ with the following property: $s \in G$ iff $\forall i \exists o : \delta(s, (i, o)) \in G$. Let $Cpre(A(D), X) = \{s \mid \forall i \exists o : \delta(s, (i, o)) \in X\}$. Then we iteratively compute $G$ as follows:

    G = F;
    do
        G1 = G;
        G = Cpre(A(D), G1);
    while (G != G1);

3. If the initial state $s \notin G$, then the specification is unrealizable. Otherwise, $MPS(D)$ is obtained by declaring $G$ as the set of final states, retaining all the transitions in $A(D)$ between states in $G$, and redirecting the remaining transitions of $A(D)$ to a unique reject state $r$ which is made a sink state.

Proposition 1.

For a given QDDC formula $D$ the above algorithm computes the maximally permissive supervisor $MPS(D)$. □

The proposition follows straightforwardly by combining Theorem 1 with the correctness of the standard safety synthesis algorithm [12]. We omit a detailed proof.

Maximally permissive H-Optimal Supervisor (MPHOS)

Given a supervisor $S$ and a desired QDDC formula $D$ which should hold "as much as possible" (both are over input-output variables $(I, O)$), we give a method for constructing an "optimal" sub-supervisor of $S$, which maximizes the expected value of the count of $D$ holding in the next $H$ moves when averaged over all the inputs.

First consider $A^{Arena} = S \times A(Ind(D, w))$, which is a supervisor over input-output variables $(I, O \cup \{w\})$. It augments $S$ by producing an additional output $w$ which witnesses the truth of $D$. It has the property $L(A^{Arena}) \downarrow (I \cup O) = L(S)$. Also, for $\sigma \in L(A^{Arena})$ and $i \in dom(\sigma)$ we have $w \in \sigma[i]$ iff $\sigma[0 : i] \models D$. Thus, every transition of $A^{Arena}$ is labelled with $w$ iff $D$ holds on taking the transition. Let the weight of transitions labelled with $w$ be 1 and 0 otherwise. Thus, for $o \in 2^{O \cup \{w\}}$ let $wt(o) = 1$ if $w \in o$ and 0 otherwise. Technically, this makes $A^{Arena}$ a weighted automaton.

In the supervisor $A^{Arena} = (Q, \Sigma, s, \delta, Q - \{r\})$, where $r$ is the unique reject state, we define for $q \in Q$ with $q \neq r$ and $i \in 2^I$ the set $LegalOutputs(q, i) = \{o \mid \delta(q, i, o) \neq r\}$. We also define a deterministic selection rule as a function $f$ s.t. $f(q, i) \in LegalOutputs(q, i)$, and a non-deterministic selection rule as a function $F$ s.t. $F(q, i) \in \{O' \subseteq LegalOutputs(q, i) \mid O' \neq \emptyset\}$. Let $H$ be a natural number. Then an $H$-horizon policy $\pi$ is a sequence $F_1, F_2, \ldots, F_H$ of non-deterministic selection rules. A deterministic policy will use only deterministic selection rules. A policy is stationary (memory-less) if each $F_i$ is the same independently of $i$.

Given a state $s$, a policy $\pi$ and an input sequence $ii \in (2^I)^H$ (of length $H$), we define $L(A^{Arena}, ii, s)$ as all runs of $A^{Arena}$ over the input $ii$ starting from state $s$, and $L_\pi(A^{Arena}, ii, s)$ as all runs over input $ii$ starting from $s$ and following the selection rule $F_i$ at step $i$. Each run has the form $(ii, oo)$. Let $Value(ii, oo) = \sum_{1 \le i \le \#ii} wt(oo[i])$. Thus, $Value(ii, oo)$ gives the count of $D$ holding during the behaviour fragment $(ii, oo)$. Then, we define $VMIN_\pi(s, ii) = \min \{Value(ii, oo) \mid (ii, oo) \in L_\pi(A^{Arena}, ii, s)\}$, which gives the minimum possible count of $D$ among all the runs of $S$ under policy $\pi$ on input $ii$, starting with state $s$. We also define $VMAX(s, ii) = \max \{Value(ii, oo) \mid (ii, oo) \in L(A^{Arena}, ii, s)\}$, which gives the maximum achievable count. Note that $VMAX$ is independent of any policy.

Given a horizon value (natural number) $H$, $A^{Arena}$ and a non-deterministic $H$-horizon policy $\pi$, we define utility values $ValAvgMin_\pi(s)$ and $ValAvgMax(s)$ for each state $s$ of $A^{Arena}$ as follows.

$ValAvgMax(s) = E_{ii \in (2^I)^H}\ VMAX(s, ii)$
$ValAvgMin_\pi(s) = E_{ii \in (2^I)^H}\ VMIN_\pi(s, ii)$

Thus, intuitively, $ValAvgMax(s)$ gives the maximal achievable count of $D$ from state $s$, when averaged over all inputs of length $H$.
Similarly, $ValAvgMin_\pi(s)$ gives the minimal such count for $D$ under policy $\pi$, when averaged over all inputs of length $H$. Our aim is to construct a horizon-$H$ policy $\pi^* = \operatorname{argmax}_\pi ValAvgMin_\pi(s)$. This will turn out to be a stationary policy given by a selection rule $F^*$. This rule can be implemented as a supervisor denoted by $MPHOS(A^{Arena}, H)$. We now give its construction.

The well known value iteration algorithm allows us to efficiently compute $ValAvgMax(s)$ as the recursive function $Val(s, H)$ below.

$Val(s, 0) = 0$
$Val(s, p + 1) = E_{i \in 2^I} \max_{o \in 2^{O \cup \{w\}} : \delta(s, (i, o)) \neq r} \{ wt(o) + Val(\delta(s, (i, o)), p) \}$

We omit the straightforward proof that $Val(s, H) = ValAvgMax(s)$ (see [24]). Having computed this, the optimal selection rule $F^*$ giving the stationary policy $\pi^*$ is given as follows: for each state $s$ of $A^{Arena}$ and each input $i \in 2^I$,

$F^*(s, i) = \operatorname{argmax}_{o \in 2^{O \cup \{w\}}} \{ wt(o) + Val(s', H) \mid \delta_{A^{Arena}}(s, (i, o)) = s' \wedge s' \neq r \}$

Note that $F^*(s, i)$ is non-deterministic as more than one output $o$ may satisfy the argmax condition. The following well-known lemma states that the stationary policy $\pi^*$ using the selection rule $F^*$ is $H$-optimal.

Lemma 1. For all states $s$ of $A^{Arena}$, $ValAvgMin_{\pi^*}(s) = ValAvgMax(s)$ always holds. Therefore, for all states $s$ of $A^{Arena}$ and for any $H$-horizon policy $\pi$, $ValAvgMin_\pi(s) \le ValAvgMin_{\pi^*}(s)$ also holds. □

We omit the proof of these well known properties from optimal control of Markov Decision Processes (see [24]).

The supervisor $A^{Arena}$ is pruned to retain only the transitions with the outputs in the set $F^*(s, i)$ (as these are all equally optimal). This gives us the maximally permissive $H$-optimal sub-supervisor of $A^{Arena}$ w.r.t. $D$. This supervisor is denoted by $MPHOS(A^{Arena}, H)$ or equivalently $MPHOS(S, D, H)$. The following proposition follows immediately from the construction of $MPHOS(S, D, H)$ and Lemma 1.
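The value iteration recursion and the pruning by the selection rule can be illustrated on an explicit-state arena. The sketch below is a minimal illustration under stated assumptions: the arena is a plain dictionary `legal` mapping `(state, input)` to the non-rejecting `(output, successor)` pairs, inputs are uniformly distributed, the successor is scored with its horizon-$H$ value, and all names as well as the two-state example are hypothetical. The tool itself performs this computation symbolically.

```python
from fractions import Fraction


def value_iteration(states, inputs, legal, weight, horizon):
    """Return Val(., horizon): expected best achievable count under uniform inputs."""
    val = {s: Fraction(0) for s in states}  # Val(s, 0) = 0
    for _ in range(horizon):
        # Val(s, p+1) = average over inputs of the best (weight + value of successor).
        val = {
            s: sum(max(weight(o) + val[t] for o, t in legal[(s, i)]) for i in inputs)
            / len(inputs)
            for s in states
        }
    return val


def optimal_rule(states, inputs, legal, weight, val):
    """Selection rule keeping every output that attains the maximum score."""
    rule = {}
    for s in states:
        for i in inputs:
            scores = {o: weight(o) + val[t] for o, t in legal[(s, i)]}
            best = max(scores.values())
            rule[(s, i)] = {o for o, score in scores.items() if score == best}
    return rule


# Hypothetical arena; an output is a pair (name, w) where w = 1 marks that D holds.
states = ["A", "B"]
inputs = [0, 1]
legal = {
    ("A", 0): [(("x", 1), "B"), (("y", 0), "A")],
    ("A", 1): [(("x", 0), "A")],
    ("B", 0): [(("x", 0), "A")],
    ("B", 1): [(("x", 0), "A"), (("y", 1), "B")],
}


def weight(output):
    return output[1]


val = value_iteration(states, inputs, legal, weight, horizon=2)
rule = optimal_rule(states, inputs, legal, weight, val)
print(val)
print(rule[("A", 0)], rule[("B", 1)])
```

Exact rationals are used so that ties in the argmax are detected reliably; with floating point values a tolerance would be needed.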

Proposition 2.
1. $S \le_{det} MPHOS(S, D, H)$, for all $H$.
2. $MPHOS(S, D, H)$ is the maximally permissive $H$-optimal sub-supervisor of $S$.
3. If $MPHOS(S, D, H) \le_{det} S'$ then $S'$ is $H$-optimal. □

A controller $Cnt$ can be obtained from a supervisor $S$ by resolving the output non-determinism in $S$. We give a rather straightforward mechanism for this. We allow the user to specify an ordering $Ord$ on the set of outputs $2^O$. A given supervisor $S$ is determinized by retaining only the highest ordered output among those permitted by $S$. This is denoted $Det_{Ord}(S)$. The output ordering is specified by giving a lexicographically ordered list of output variable literals. This facility is used to determinize $MPHOS$ and $MPS$ supervisors as required.
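The determinization step can be pictured as a sort by a lexicographic preference key. The following is a minimal sketch, assuming that an output is represented as the frozenset of output variables set to true and that the ordering is a list of `(variable, polarity)` literals; the function names and the variables `o1`, `o2` are hypothetical and do not reflect the tool's input syntax.

```python
def preference_key(output, literals):
    """Smaller key means higher priority: 0 where the output agrees with a literal, 1 otherwise."""
    return tuple(0 if (var in output) == polarity else 1 for var, polarity in literals)


def determinize(permitted, literals):
    """Keep only the highest ordered output for every (state, input) pair of a supervisor."""
    return {
        key: min(outputs, key=lambda o: preference_key(o, literals))
        for key, outputs in permitted.items()
    }


# Prefer o1 true first, then o2 false.
literals = [("o1", True), ("o2", False)]
all_outputs = [frozenset(), frozenset({"o1"}), frozenset({"o2"}), frozenset({"o1", "o2"})]
ranked = sorted(all_outputs, key=lambda o: preference_key(o, literals))
print([sorted(o) for o in ranked])

# A supervisor that forbids o1 at (q0, i0): the best remaining output sets nothing.
permitted = {("q0", "i0"): [frozenset({"o2"}), frozenset()]}
print(determinize(permitted, literals))
```

If the literal list does not mention every output variable, several outputs may share a key; `min` then returns the first one listed, which is an arbitrary tie-break of this sketch.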

Example 2.

For a supervisor $S$ over variables $(I, \{o_1, o_2\})$, an example output order can be given as the lexicographically ordered list $(o_1 >\ !o_2)$. Then, for any transition, the determinization step will try to select the highest ordered output (which is allowed by $S$) from the list $\{(o_1 = true, o_2 = false), (o_1 = true, o_2 = true), (o_1 = false, o_2 = false), (o_1 = false, o_2 = true)\}$. □

A DCSynth specification is a tuple $(I, O, D^h, D^s)$, where $I$ and $O$ are the sets of input and output variables, respectively. Formula $D^h$, called the hard requirement, and formula $D^s$, called the soft requirement, are QDDC formulas over the set of propositions $PV = I \cup O$. Let $H$ be a natural number called the horizon. The objective in DCSynth is to synthesize a deterministic controller which (a) invariantly satisfies the hard requirement $D^h$, and (b) is $H$-optimal w.r.t. $D^s$ amongst all the controllers satisfying (a).

Given a specification $(I, O, D^h, D^s)$, a horizon value $H$ (a natural number) and a total ordering $Ord$ on the set of outputs $2^O$, the controller synthesis in DCSynth can be given as Algorithm 1.

Algorithm 1 ControllerSynthesis
Input: $S = (I, O, D^h, D^s)$, horizon $H$, output ordering $Ord$
Output: Controller $Cnt$ for $S$.
1. $A^{mps} = MPS(D^h)$
2. $A^{mphos} = MPHOS(A^{mps}, D^s, H)$
3. $Cnt = Det_{Ord}(A^{mphos})$
4. Encode the automaton $Cnt$ in an implementation language.

Step 1 uses the MPS construction given in Section 4.1. Step 2 uses the MPHOS construction given in Section 4.2, whereas Step 3 uses the determinization method of Section 4.3.
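Step 1 rests on the greatest fixed point computation of Section 4.1, which can be sketched over an explicit transition table. The code below is an illustration only: it assumes a total dictionary `delta` from `(state, input, output)` to states, uses hypothetical function names and a hypothetical three-state automaton, and intersects with the current set in every round so that the result stays inside the final states, as the characterisation of $G$ requires.

```python
def cpre(states, inputs, outputs, delta, target):
    """Controllable predecessor: states where every input has some output leading into target."""
    return {
        q for q in states
        if all(any(delta[(q, i, o)] in target for o in outputs) for i in inputs)
    }


def max_permissive_supervisor(states, inputs, outputs, delta, init, final):
    """Return (winning states, retained transitions), or None if the specification is unrealizable."""
    g = set(final)
    while True:
        # Shrink G until it is stable; the intersection keeps G a subset of the final states.
        g_next = g & cpre(states, inputs, outputs, delta, g)
        if g_next == g:
            break
        g = g_next
    if init not in g:
        return None
    # Transitions leaving G are dropped here; in the DFA view they go to the reject sink.
    kept = {key: t for key, t in delta.items() if key[0] in g and t in g}
    return g, kept


# Hypothetical formula automaton: state 2 is non-final, state 1 cannot avoid it on input "a".
states = {0, 1, 2}
inputs = ["a", "b"]
outputs = ["x", "y"]
delta = {
    (0, "a", "x"): 0, (0, "a", "y"): 1, (0, "b", "x"): 0, (0, "b", "y"): 2,
    (1, "a", "x"): 2, (1, "a", "y"): 2, (1, "b", "x"): 0, (1, "b", "y"): 0,
    (2, "a", "x"): 2, (2, "a", "y"): 2, (2, "b", "x"): 2, (2, "b", "y"): 2,
}
result = max_permissive_supervisor(states, inputs, outputs, delta, init=0, final={0, 1})
print(result)
print(max_permissive_supervisor(states, inputs, outputs, delta, init=1, final={0, 1}))
```

Steps 2 and 3 would then operate on the retained transitions: the soft requirement is tracked through its indicator output, and the remaining output choices are resolved by the ordering.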

Proposition 3. The controller $Cnt$ output by Algorithm 1 invariantly satisfies $D^h$, and it intermittently, but $H$-optimally, satisfies $D^s$.

Proof. By Proposition 1, $A^{mps}$ realizes inv $D^h$. Then, by Proposition 2, $A^{mphos}$ and $Cnt$ are sub-supervisors of $A^{mps}$ and hence they also realize inv $D^h$. Moreover, by Lemma 1, we get that $A^{mphos}$ is $H$-optimal w.r.t. $D^s$. Hence, by Proposition 2, we get that $Cnt$, which is a sub-supervisor of $A^{mphos}$, is also $H$-optimal with respect to $D^s$. □

At all stages of the above synthesis, the automata/supervisors $A(D^h)$, $A(D^s)$, $A^{mps}$, $A^{mphos}$ and $Cnt$ are all represented as semi-symbolic automata (SSDFA) using the MONA [15] DFA data structure. In this representation, the transition function is represented as a multi-terminal BDD. The MONA DFA library provides a rich set of automata operations including product, projection, determinization and minimization over the SSDFA. The algorithms discussed in Sections 4.1, 4.2 and 4.3 are implemented over SSDFA. Moreover, these algorithms are adapted to work without actually expanding the specification automata into a game graph. At each stage of computation, the automata and supervisors are aggressively minimized, which leads to significant improvement in the scalability and computation time of the tool. Appendix D gives the details of the SSDFA data structure and its use in the symbolic computation of supervisors in an efficient manner.

For a DCSynth specification, $D^h$ and $D^s$ can be any QDDC formulas. While invariance of $D^h$ is guaranteed by the synthesis algorithm, the quality of the controller is controlled by optimizing the outputs for which the soft requirement $D^s$ holds. For example, $D^s$ may specify outputs which save energy, giving an energy efficient controller. The soft requirement can also be used to improve the robustness [3] of the controller (see [30]). Below, we consider specifications structured as assumptions and commitments, and their optimized robustness using our soft requirement guided synthesis.

For many examples, the controller specification can be given as a pair $(A, C)$ of QDDC formulas over input-output variables $(I, O)$. Here, the commitment $C$ is a formula specifying the desired behaviour which must ideally hold invariantly. But this may be unrealizable, and a suitable assumption $A$ on the behaviour of the environment may have to be made for $C$ to hold. In case the assumption $A$ does not hold, it is still desirable that the controller satisfies $C$ intermittently but "as much as possible". Given this assumption-commitment pair $(A, C)$, we specify four types of derived controller specifications $(I, O, D^h, D^s)$ as follows.

Type     Hard Requirement $D^h$     Soft Requirement $D^s$
Type0    $C$                        true
Type1    $(A \Rightarrow C)$        true
Type2    true                       $C$
Type3    $(A \Rightarrow C)$        $C$

A Type0 controller gives the best guarantee but it may be unrealizable. A Type1 controller provides a firm but conditional guarantee. A Type2 controller tries to achieve $C$ in $H$-optimal fashion irrespective of any assumption, and a Type3 controller provides a firm conditional guarantee and also tries to satisfy $C$ in $H$-optimal fashion even when the assumption does not hold.

For the same assumption-commitment pair $(A, C)$, we can synthesize diverse controllers using different specification types, horizon values and output orderings. In order to compare the performance of these different controllers, we define two metrics: i) the Expected Case Performance measure to compare average case behaviour, and ii) Must Dominance to compare the guaranteed behaviour.

i) Expected Case Performance:

Given a controller Cnt over input-output alphabet $(I, O)$ and a QDDC formula (regular property) $C$ over variables $I \cup O$, we can construct a Discrete Time Markov Chain (DTMC), denoted $M_{unif}(Cnt, C)$, whose analysis allows us to measure the probability of $C$ holding in long runs (steady state) of Cnt under random independent and identically distributed (iid) inputs. This value is designated as $E_{unif}(Cnt, C)$.

The construction of the desired DTMC is as follows. The product $Cnt \times A(C)$ gives a finite state automaton with the same behaviours as Cnt. Moreover, it is in an accepting state exactly when $C$ holds for the past behaviour. (Here $A(C)$ works as a total deterministic monitor automaton for $C$ without restricting Cnt.) By assigning uniform discrete probabilities to all the inputs from any state, we obtain the DTMC $M_{unif}(Cnt, C)$ along with a designated set of accepting states, such that the DTMC is in an accepting state precisely when $C$ holds. Standard techniques from Markov chain analysis allow us to compute the probability (expected value) of being in the set of accepting states on long runs (steady state) of the DTMC. This gives us the desired value $E_{unif}(Cnt, C)$. A leading probabilistic model checking tool, MRMC, implements this computation [14]. In DCSynth, we provide a facility to compute $M_{unif}(Cnt, C)$ in a format accepted by MRMC. Hence, using DCSynth and MRMC, we are able to compute $E_{unif}(Cnt, C)$.

ii) Guaranteed Performance as Must-Dominance: Consider two supervisors $S_1$, $S_2$ and a regular property $C$. We say that $S_i$ guarantees $C$ for an input sequence $ii$ provided that, for every output sequence $oo \in S_i[ii]$ produced by $S_i$ on $ii$, the pair $(ii, oo)$ satisfies $C$. We say that $S_2$ must dominates $S_1$ with respect to the property $C$ provided that, for every input sequence $ii$, if $S_1$ guarantees $C$ then $S_2$ also guarantees $C$. Thus, $S_2$ provides a superior must guarantee of $C$ than $S_1$.

Definition 7 (Must Dominance). Given two supervisors $S_1$, $S_2$ and a property (formula) $C$ over input-output alphabet $(I, O)$, the must dominance of $S_2$ over $S_1$ is defined as $S_1 \le^{C}_{dom} S_2$ iff $MustInp(S_1, C) \subseteq MustInp(S_2, C)$, where

\[ MustInp(S_i, C) = \{\, ii \in (2^I)^+ \mid \forall oo \in (2^O)^+ .\ ((ii, oo) \in L(S_i) \Rightarrow (ii, oo) \models C) \,\}. \]
□

We establish must dominance relations among MPHOS supervisors of the various types of specifications discussed in Section 5.1.
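The expected case measure can be approximated directly on a small explicit-state model. The sketch below is an illustration under stated assumptions, not the computation performed by MRMC: it propagates the state distribution of the product $Cnt \times A(C)$ under uniform iid inputs and returns the long-run average probability of being in an accepting state. The function name `expected_acceptance`, the `horizon` parameter and the toy monitor are hypothetical.

```python
def expected_acceptance(init, step, accepting, inputs, horizon=10_000):
    """Long-run average probability of being in an accepting state.

    `step(state, inp)` is the deterministic transition function of the
    product of a controller and a monitor; each input in `inputs` is
    drawn with equal probability at every cycle (iid assumption).
    """
    dist = {init: 1.0}
    weight = 1.0 / len(inputs)
    total = 0.0
    for _ in range(horizon):
        nxt = {}
        for state, prob in dist.items():
            for inp in inputs:
                target = step(state, inp)
                nxt[target] = nxt.get(target, 0.0) + prob * weight
        dist = nxt
        # Accumulate the acceptance probability after this cycle.
        total += sum(p for s, p in dist.items() if s in accepting)
    return total / horizon

# Toy monitor (assumption): the state counts trailing 1-inputs, capped at 2;
# the property holds unless the last two inputs were both 1.
def toy_step(state, inp):
    return min(state + 1, 2) if inp == 1 else 0

TOY_VALUE = expected_acceptance(0, toy_step, {0, 1}, [0, 1])
```

Averaging over cycles, rather than reading off the final distribution, keeps the estimate meaningful even when the chain is periodic. An exact steady-state analysis, as provided by a probabilistic model checker, is preferable for real specifications.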

Lemma 2. For any QDDC formulas $A$ and $C$, and any horizon $H$, the following must dominance relations hold:

\[ MPHOS_1(A, C) \le^{C}_{dom} MPHOS_3(A, C) \le^{C}_{dom} MPHOS_0(A, C) \]
\[ MPHOS_2(A, C) \le^{C}_{dom} MPHOS_0(A, C) \]

where $MPHOS_i(A, C)$ denotes the maximally permissive H-optimal supervisor $A^{MPHOS}$ of Algorithm 1 for the specification $Type_i(A, C)$.

Proof. By definition, $MPHOS_0(A, C)$ invariantly satisfies $C$ for all input sequences. Hence, $MustInp(MPHOS_0(A, C), C) = (2^I)^+$, which immediately gives us that $S \le^{C}_{dom} MPHOS_0(A, C)$ for any supervisor $S$.

Now we prove the remaining relation $MPHOS_1(A, C) \le^{C}_{dom} MPHOS_3(A, C)$. Let $S = MPS(A \Rightarrow C)$. Then, $MPHOS_1(A, C) = MPHOS(S, true, H) = S$. The second equality holds as the soft requirement true does not cause any pruning of outputs in the H-optimal computation. By definition, $MPHOS_3(A, C) = MPHOS(S, C, H)$. By Proposition 2, $S \le_{det} MPHOS(S, C, H)$, which gives us the required result. □

Note that in general, $MPHOS_2(A, C)$ is theoretically incomparable with $MPHOS_1(A, C)$ and $MPHOS_3(A, C)$, as $MPHOS_2(A, C)$ is a supervisor that does not have to meet any hard requirement, but it optimally meets the soft requirements irrespective of the assumption. However, for specific $(A, C)$ instances, some additional must-dominance relations may hold between $MPHOS_2(A, C)$ and the other supervisors.

We have carried out experiments with i) the Mine-pump specification presented in this section, and ii) an Arbiter specification given in Appendix A.1.
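To make the must dominance relation concrete, the following sketch checks $MustInp(S_1, C) \subseteq MustInp(S_2, C)$ by brute-force enumeration of input sequences up to a bounded length. It is only an illustration: the bound `max_len`, the dictionary encoding of supervisors and the two toy supervisors are assumptions, and the enumeration is not the automata-theoretic check used by the tool.

```python
from itertools import product

def outputs_for(sup, inputs):
    """All output sequences the supervisor may produce on the input sequence."""
    frontier = [(sup["init"], ())]
    for inp in inputs:
        frontier = [(nxt, outs + (out,))
                    for state, outs in frontier
                    for out, nxt in sup["delta"][(state, inp)]]
    return {outs for _, outs in frontier}

def must_inputs(sup, prop, alphabet, max_len):
    """Input sequences (up to max_len) on which every run satisfies prop."""
    result = set()
    for n in range(1, max_len + 1):
        for ii in product(alphabet, repeat=n):
            if all(prop(ii, oo) for oo in outputs_for(sup, ii)):
                result.add(ii)
    return result

def must_dominated_by(s1, s2, prop, alphabet, max_len):
    """Bounded check of s1 <=_dom s2, i.e. MustInp(s1) is a subset of MustInp(s2)."""
    return (must_inputs(s1, prop, alphabet, max_len)
            <= must_inputs(s2, prop, alphabet, max_len))

# Toy supervisors (assumptions): one input bit, one output bit, one state.
# LOOSE allows any output; TIGHT always echoes the current input.
LOOSE = {"init": 0, "delta": {(0, 0): [(0, 0), (1, 0)], (0, 1): [(0, 0), (1, 0)]}}
TIGHT = {"init": 0, "delta": {(0, 0): [(0, 0)], (0, 1): [(1, 0)]}}

def echo(ii, oo):
    """Property: the current output equals the current input."""
    return oo[-1] == ii[-1]
```

Here `TIGHT` is a sub-supervisor of `LOOSE`, so it guarantees the property on at least as many input sequences, in the same spirit as the relation between $MPHOS_1$ and $MPHOS_3$ in Lemma 2.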

Mine-pump: The Mine-pump controller (see [21]) has two input sensors, the high water level sensor HH2O and the methane leakage sensor HCH4, and one output, PUMPON, to keep the pump on. The objective of the controller is to safely operate the pump in such a way that the water level never remains high continuously for more than w cycles. Thus, the Mine-pump controller specification has input and output variables ({HH2O, HCH4}, {PUMPON}).

We have the following assumptions on the mine and the pump. Their conjunction is denoted MineAssume(ε, ζ, κ), with integer parameters ε, ζ, κ. Being of the form []D, each formula states that the property D (described in text) holds for all observation intervals in the past.

- Pump capacity: []!(slen = ε && ([[PUMPON && HH2O]] ^ ⟨HH2O⟩)). If the pump is continuously on for ε cycles with the water level also continuously high, then the water level will not be high at the (ε + 1)-th cycle.
- Methane release: [](([HCH4] ^ [!HCH4] ^ ⟨HCH4⟩) ⇒ (slen > ζ)) and []([[HCH4]] ⇒ slen < κ). The minimum separation between two leaks of methane is ζ cycles, and a methane leak cannot persist for more than κ cycles.

The commitments are as follows. The conjunction of commitments is denoted by MineCommit(w), and they hold intermittently in the absence of the assumption.

- Safety conditions: true ^ ⟨(HCH4 || !HH2O) ⇒ !PUMPON⟩ states that if there is a methane leak or absence of high water in the current cycle, then the pump should be off in the current cycle. The formula !(true ^ ([[HH2O]] && slen = w)) states that the water level does not remain continuously high in the last w + 1 cycles.

The Mine-pump specification, denoted MinePump(w, ε, ζ, κ), is given by the assumption-commitment pair (MineAssume(ε, ζ, κ), MineCommit(w)). The four types of DCSynth specifications of Section 5.1 can be derived from this. Figure 3 in Appendix B gives the textual source of $Type_3(MinePump(8, 2, 6, 2))$.

Arbiter:

Due to space limitations, the detailed specification of the arbiter, briefly discussed as Example 1 in Section 1, is given in Appendix A.1. The arbiter is denoted as Arb(n, k, r), where n denotes the number of clients, k is the response time (the time for which a client should keep the request high continuously to get guaranteed access) and r is the maximum number of requests that can be true simultaneously.

Table 1. Synthesis from Mine-pump(8,2,6,2) and Arb(5,3,2) specifications in DCSynth. The last column gives the expected value of the commitment in the long run on random inputs. Synthesis statistics are given as States/Time.

Sr No | Controller type | Output ordering | MPS stats | MPHOS stats | Controller stats | Expected value
Mine-pump(8,2,6,2)
      | Type            | PUMPON          | 70/0.00045 | 70/0.00254  | 47/0.00230 | 0.06
      | Type            | PUMPON          | 1/0.00004  | 10/0.00545  | 10/0.00019 | 0.998057
      | Type            | PUMPON          | 70/0.00045 | 75/0.044216 | 73/0.00082 | 0.99805
Arb(5,3,2)
      | Type            | ArbDef          |            |             |            |

Given an assumption-commitment pair (A, C), the four types of DCSynth specifications can be derived as given in Section 5.1. Given any such specification, a horizon value H, and an ordering of outputs, a controller can be synthesized using our tool DCSynth as described in Section 4.4. For the Mine-pump instance MinePump(8, 2, 6, 2), controllers were synthesized for the derived specification types with horizon value H = 50 and output ordering PUMPON. These controllers choose to get rid of water aggressively by keeping the pump on whenever possible. Similarly, controllers were also synthesized with the output ordering !PUMPON. These controllers save energy by keeping the pump off whenever possible. Note that, in our synthesis method, hard and soft requirements are fulfilled before applying the output orderings.

For the Arbiter instance Arb(5, 3, 2) also, controllers were synthesized for all the four derived specification types with horizon value H = 50 and output ordering ArbDef = (a1 > a2 > a3 > a4 > a5). This ordering tries to give acknowledgments such that client i has higher priority than client j for all i < j.

In Table 1 we give the performance of the tool DCSynth in synthesizing these controllers. The table gives the time taken at each stage of the synthesis algorithm, and the sizes of the computed supervisors/controllers. The experiments were conducted on a Linux (Ubuntu 16.04) system with an Intel i5 64-bit, 2.5 GHz processor and 4 GB memory.

Experimental Evaluation of Expected Case Performance:

The last column of Table 1 gives the expected value of the commitment holding in the long run for the controllers of various types for both Mine-pump and Arbiter instances. This value is computed as outlined in Section 5.2. It can be observed from Table 1 that in both the examples, the controllers for Type1 specifications (i.e., when soft requirements are not used) have 0 expected value of commitment C. This is because of the strong assumptions used in guaranteeing C, which themselves have expected value 0. In such a case, whenever the assumption fails, the synthesis algorithm has no incentive to try to meet C. On the other hand, with soft requirement C in Type2 and Type3 specifications, the H-optimal controllers have the expected value of C above 99%. This remarkable increase in the expected value of the commitment shows that H-optimal synthesis is very effective in figuring out controllers which meet the desirable property C as much as possible, irrespective of the assumption.

Experimental Evaluation of Must-Dominance:

Given supervisors $S_1$, $S_2$ for an assumption-commitment pair $(A, C)$, since both $S_1$, $S_2$ are finite state Mealy machines and $C$ is a regular property, an automata theoretic technique can automatically check whether $S_1 \le^{C}_{dom} S_2$. We omit the details of this technique here; it is presented in Appendix C, Proposition 4. This technique is implemented in our tool DCSynth. In case $S_1 \le^{C}_{dom} S_2$ does not hold, the tool provides a counter example.

For our case studies, we experimentally compare must dominance of supervisors $MPHOS_i(A, C)$ as defined in Lemma 2. Recall that $MPHOS_i(A, C)$ denotes the maximally permissive H-optimal supervisor for the specification $Type_i(A, C)$. The results obtained (with H = 50) are as follows.

1. Mine-pump instance MinePump(8, 2, 6, 2), denoted by MP(8, 2, 6, 2):
   $MPHOS_1(MP(8, 2, 6, 2)) <^{C}_{dom} MPHOS_2(MP(8, 2, 6, 2)) =^{C}_{dom} MPHOS_3(MP(8, 2, 6, 2))$
2. Arbiter instance Arb(5, 3, 2):
   $MPHOS_1(Arb(5, 3, 2)) <^{C}_{dom} MPHOS_2(Arb(5, 3, 2)) =^{C}_{dom} MPHOS_3(Arb(5, 3, 2))$

$MPHOS_3$ must dominates $MPHOS_1$ as expected, as $MPHOS_3$ is a sub-supervisor of $MPHOS_1$. What is interesting and surprising is that in both the case studies, Arbiter and Mine-pump, the $MPHOS_2$ and $MPHOS_3$ supervisors are found to be syntactically identical. This is not theoretically guaranteed. In these case studies, the H-optimal $MPHOS_2$ already provides all the must-guarantees of the hand-crafted hard requirements of $MPHOS_3$. The H-optimization of $C$ seems to exhibit a startling ability to guarantee $C$ without human intervention. We will attempt to validate this with more examples in future. So far we have considered the commitment as the soft requirement. In general, the soft requirement can be used to optimize the MPS w.r.t. any regular property of interest, whereas the hard requirements give the necessary must guarantees. Such soft requirements may embody performance and quality goals. Hence, it is advisable to use a combination of hard and soft requirements based on the criticality of each requirement.

Reactive synthesis from Linear Temporal Logic (LTL) specifications is a widely studied area [3], and a considerable number of tools [5, 11] supported by theoretical foundations are available. The leading tools such as Acacia+ [5] and BoSy [11] mainly focus on the future fragment of LTL. In contrast, this paper focuses on invariance of complex regular properties, denoted by inv $D^h$ where $D^h$ is a QDDC formula. For such a property, a maximally permissive supervisor (MPS) can be synthesized. Formally, the logics LTL and QDDC have incomparable expressive power. There is increasing evidence that regular properties form an important class of requirements [9, 18, 19]. The IEEE standard PSL extends LTL with regular properties [1]. Wonham and Ramadge in their seminal work [25, 26] first studied the synthesis of maximally permissive supervisors from regular properties. In their supervisory control theory, the MPS can in fact be synthesized for a richer property class AGEF $D^h$ [10]. Tool DCSynth can be easily extended to support such properties too. Riedweg et al. [28] give some sub-classes of Quantified Mu-Calculus for which the MPS can be computed. However, none of these works address soft requirement guided synthesis.

Most of the reactive synthesis tools focus on correct-by-construction synthesis from hard requirements. For example, none of the tools in the recent competition on reactive synthesis, SYNTCOMP17 [13], address the issue of guided synthesis, which is our main focus. In our approach, we refine the MPS (for hard requirements) to a sub-supervisor optimally satisfying the soft requirements too. Since LTL does not admit an MPS, it is unclear how our approach can extend to it.

In quantitative synthesis, a weighted arena is assumed to be available, and algorithms for optimal controller synthesis for diverse objectives such as mean-payoff [4] or energy [6] have been investigated. In our case, we first synthesize the weighted arena from the given hard and soft requirements. Moreover, we use H-optimality as the synthesis criterion. This criterion has been widely used in reinforcement learning as well as optimal control of MDPs [2, 24]. In other related work, techniques for optimal controller synthesis are discussed by Ding et al. [9], Wongpiromsarn et al. [32] and Raman et al. [27], who have explored the use of receding horizon model predictive control along with temporal logic properties.

Since our focus is on the quality of the controllers, we have also defined metrics and measurement techniques for comparing the controllers for their guaranteed (based on must dominance) and expected case performance. For the expected case measurement, we have assumed that inputs are iid. However, the method can easily accommodate a finite state Markov model governing the occurrences of inputs.

DCSynth uses an efficient BDD-based symbolic representation, inherited from the tool MONA [15], for storing automata, supervisors and controllers. The use of eager minimization (see Appendix D for implementation details) allows us to handle much more complex properties (see Appendix E).

We have presented a technique for guided synthesis of controllers from hard and soft requirements specified in the logic QDDC. This technique is also implemented in our tool DCSynth. Case studies show that the combination of hard and soft requirements provides us with a capability to deal with unrealizable (but desirable), conflicting and default requirements. In the context of assumption-commitment based specification, we have shown with case studies that soft requirements improve the expected case performance, whereas hard requirements provide certain (but typically conditional) guarantees on the synthesized controller. Hence, the combination of hard and soft requirements, as formulated in the derived specification types, can be assessed using the expected value and must dominance metrics. This helps us in designing better performing controllers.

References

1. IEC 62531:2012(E) (IEEE Std 1850-2010): Standard for Property Specification Language (PSL). IEC 62531:2012(E) (IEEE Std 1850-2010), pages 1–184, June 2012.
2. R. E. Bellman. Dynamic Programming. Princeton Univ. Press, 1957.
3. R. Bloem, K. Chatterjee, K. Greimel, T. A. Henzinger, G. Hofferek, B. Jobstmann, B. Könighofer, and R. Könighofer. Synthesizing robust systems. Acta Inf., 51(3-4):193–220, 2014.
4. R. Bloem, K. Chatterjee, T. A. Henzinger, and B. Jobstmann. Better quality in synthesis through quantitative objectives. In CAV, pages 140–156, 2009.
5. A. Bohy, V. Bruyère, E. Filiot, N. Jin, and J. Raskin. Acacia+, a tool for LTL synthesis. In CAV, pages 652–657, 2012.
6. Patricia Bouyer, Nicolas Markey, Mickael Randour, Kim G. Larsen, and Simon Laursen. Average-energy games. Acta Informatica, 55(2):91–127, Mar 2018.
7. Gaurav Chakravorty and Paritosh K. Pandya. Digitizing interval duration logic. In Computer Aided Verification, 15th International Conference, CAV 2003, Boulder, CO, USA, July 8-12, 2003, Proceedings, pages 167–179, 2003.
8. Z. Chaochen, C. A. R. Hoare, and A. P. Ravn. A calculus of durations. Inf. Process. Lett., 40(5):269–276, 1991.
9. Xu Chu Ding, Mircea Lazar, and Calin Belta. LTL receding horizon control for finite deterministic systems. Automatica, 50(2):399–408, 2014.
10. R. Ehlers, S. Lafortune, S. Tripakis, and M. Y. Vardi. Supervisory control and reactive synthesis: a comparative introduction. Discrete Event Dynamic Systems, 27(2):209–260, Jun 2017.
11. P. Faymonville, B. Finkbeiner, and L. Tentrup. BoSy: An experimentation framework for bounded synthesis. In CAV, pages 325–332, 2017.
12. Erich Grädel, Wolfgang Thomas, and Thomas Wilke, editors. Automata Logics, and Infinite Games: A Guide to Current Research. Springer-Verlag New York, Inc., New York, NY, USA, 2002.
13. S. Jacobs et al. The 4th reactive synthesis competition (SYNTCOMP 2017): Benchmarks, participants & results. CoRR, abs/1711.11439, 2017.
14. J. Katoen, I. S. Zapreev, E. M. Hahn, H. Hermanns, and D. N. Jansen. The ins and outs of the probabilistic model checker MRMC. Performance Evaluation, 2011. Advances in Quantitative Evaluation of Systems.
15. N. Klarlund and A. Møller. MONA Version 1.4 User Manual, 2001. Notes Series NS-01-1. Available from . Revision of BRICS NS-98-3.
16. N. Klarlund, A. Møller, and M. I. Schwartzbach. MONA implementation secrets. International Journal of Foundations of Computer Science, 13(04):571–586, 2002.
17. Shankara Narayanan Krishna and Paritosh K. Pandya. Modal strength reduction in quantified discrete duration calculus. In FSTTCS 2005: Foundations of Software Technology and Theoretical Computer Science, 25th International Conference, Hyderabad, India, December 15-18, 2005, Proceedings, pages 444–456, 2005.
18. S. Lafortune, K. Rudie, and S. Tripakis. 30 years of the Ramadge-Wonham theory of supervisory control: A retrospective and future perspectives. DCD Workshop, 2017.
19. R. M. Matteplackel, P. K. Pandya, and A. Wakankar. Formalizing timing diagram requirements in discrete duration calculus. In SEFM, pages 253–268, 2017.
20. P. K. Pandya. Model checking CTL*[DC]. In TACAS, pages 559–573, 2001.
21. P. K. Pandya. Specifying and deciding quantified discrete-time duration calculus formulae using DCVALID. In RTTOOLS (affiliated with CONCUR 2001), 2001.
22. P. K. Pandya. The saga of synchronous bus arbiter: On model checking quantitative timing properties of synchronous programs. Electr. Notes Theor. Comput. Sci., 65(5):110–124, 2002.
23. P. K. Pandya. Finding extremal models of discrete duration calculus formulae using symbolic search. Elect. Notes Theor. Comp. Sc., 128(6):247–262, 2005.
24. Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, Inc., New York, NY, USA, 1st edition, 1994.
25. P. Ramadge and W. Wonham. Supervisory control of a class of discrete event processes. SIAM Journal on Control and Optimization, 25(1):206–230, 1987.
26. Peter J. G. Ramadge and W. Murray Wonham. The control of discrete event systems. In Proceedings of IEEE, volume 77, pages 81–98, 1989.
27. Vasumathi Raman, Alexandre Donzé, Dorsa Sadigh, Richard M. Murray, and Sanjit A. Seshia. Reactive synthesis from signal temporal logic specifications. In Proceedings of the 18th International Conference on Hybrid Systems: Computation and Control, HSCC '15, pages 239–248, New York, NY, USA, 2015. ACM.
28. Stéphane Riedweg and Sophie Pinchinat. Quantified mu-calculus for control synthesis. In Mathematical Foundations of Computer Science 2003, 28th International Symposium, MFCS 2003, Bratislava, Slovakia, August 25-29, 2003, Proceedings, pages 642–651, 2003.
29. Babita Sharma, Paritosh K. Pandya, and Supratik Chakraborty. Bounded validity checking of interval duration logic. In Tools and Algorithms for the Construction and Analysis of Systems, 11th International Conference, TACAS 2005, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2005, Edinburgh, UK, April 4-8, 2005, Proceedings, pages 301–316, 2005.
30. A. Wakankar, P. K. Pandya, and R. M. Matteplackel. DCSYNTH: guided reactive synthesis with soft requirements for robust controller and shield synthesis. CoRR, abs/1711.01823, 2017.
31. A. Wakankar, P. K. Pandya, and R. M. Matteplackel. DCSynth 1.0. TIFR, Mumbai, 2018.
32. Tichakorn Wongpiromsarn, Ufuk Topcu, and Richard M. Murray. Receding horizon temporal logic planning. IEEE Trans. Automat. Contr., 57(11):2817–2830, 2012.
33. Chaochen Zhou and Michael Hansen. Duration Calculus: A Formal Approach to Real-Time Systems. Springer, 2004.

A Other Case Studies

In this section we present 3 more case studies:
1. n-client shared resource arbiter (several different specifications),
2. alarm annunciation system.

A.1 Arbiter

An n-client resource (e.g. bus) arbiter is a circuit with $r_1, \ldots, r_n$ as inputs ($r_i$ high indicates that the i-th client requests access to the resource) and $ack_1, \ldots, ack_n$ as the corresponding outputs ($a_i$ indicates that the arbiter has granted the i-th client access to the resource). The arbiter arbitrates among a subset of requests at each cycle by setting one of the acknowledgments ($a_i$'s) true. Hard requirements on the arbiter include the following three invariant properties.

\[
\begin{aligned}
Mutex(n) &= true \,\hat{}\, \langle \wedge_{i \neq j} \neg(a_i \wedge a_j) \rangle, && 1 \le i, j \le n \\
NoLoss(n) &= true \,\hat{}\, \langle (\vee_i r_i) \Rightarrow (\vee_j a_j) \rangle, && 1 \le i, j \le n \\
NoSpurious(n) &= true \,\hat{}\, \langle \wedge_i (a_i \Rightarrow r_i) \rangle, && 1 \le i \le n \\
ARBINV(n) &= Mutex(n) \wedge NoLoss(n) \wedge NoSpurious(n).
\end{aligned}
\tag{1}
\]

Thus, Mutex gives mutual exclusion of acknowledgments, NoLoss states that if there is at least one request then there must be an acknowledgment, and NoSpurious states that an acknowledgment is only given to a requesting cell.

In the literature, various arbitration schemes for the arbiter have been proposed. Here we consider the following schemes.

– k-cycle response time: let Resp(r, a, k) denote that if the request has been high for the last k cycles, there must have been at least one acknowledgment in the last k cycles. Let ArbResp(n, k) state that for each cell i and for all observation intervals the formula $Resp(r_i, a_i, k)$ holds.

\[
\begin{aligned}
Resp(r, a, k) &= (true \,\hat{}\, ([[r]] \;\&\&\; slen = k - 1)) \Rightarrow (true \,\hat{}\, (scount\ a > 0 \;\&\&\; slen = k - 1)) \\
ArbResp(n, k) &= \wedge_{1 \le i \le n} Resp(r_i, a_i, k) \\
ArbCommit(n, k) &= ARBINV(n) \wedge ArbResp(n, k)
\end{aligned}
\tag{2}
\]

Based on k-cycle response we can define various arbiter specifications with different properties as follows:

• The specification $Arb^{hard}(n, k)$ is the following k-cycle response time DCSynth specification.

\[ Arb^{hard}(n, k) = (\{r_1, \ldots, r_n\}, \{a_1, \ldots, a_n\}, ArbCommit(n, k), \langle\rangle) \tag{3} \]

• The specification $Arb^{hard}$ above invariantly satisfies ArbCommit(n, k) if n ≤ k. Tool DCSynth gives us a concrete controller for the instance ($D^h$ = ArbCommit(6, …), $D^s$ = true). It is easy to see that there is no controller which can invariantly satisfy ArbCommit(n, k) if k < n. Consider the case when all requests $r_i$ are continuously true. Then, it is not possible to give a response to every cell in less than n cycles due to mutual exclusion of the acknowledgments $a_i$.

To handle such a desirable but unrealizable requirement we make an assumption. Let the proposition Atmost(n, i) be defined as $\exists S \subseteq \{1, \ldots, n\},\ |S| \le i.\ \wedge_{j \notin S} \neg r_j$. It states that at most i requests are true simultaneously. Then, the arbiter assumption is the formula ArbAssume(n, i) = [[Atmost(n, i)]], which states that Atmost(n, i) holds invariantly in the past.

The synchronous arbiter specification Arb(n, k, i) is the assumption-commitment pair (ArbAssume(n, i), ArbCommit(n, k)). The four types of controller specifications can be derived from this pair. Figure 4 in Appendix B gives, in textual syntax, the specification for $Type_3(Arb(5, 3, 2))$. The specification $Type_3(Arb(n, k, i))$ is denoted by $Arb^{hardAssume}$, given as follows:

\[ Arb^{hardAssume}(n, k, i) = (\{r_1, \ldots, r_n\}, \{a_1, \ldots, a_n\}, pref(ArbAssume(n, i)) \Rightarrow EP(ArbCommit(n, k)), \langle ArbCommit(n, k) \rangle) \tag{4} \]

• k-cycle response time as soft requirement: we specify the requirement of response in k cycles as a soft requirement as below.

\[ Arb^{soft}(n, k) = (\{r_1, \ldots, r_n\}, \{a_1, \ldots, a_n\}, ARBINV(n), \langle Resp(r_n, a_n, k) : 2^n, \ldots, Resp(r_1, a_1, k) : 2^1 \rangle) \tag{5} \]

Note that the soft requirement in this example is a lexicographical list of several QDDC formulas. The tool DCSynth implements an MPHOS computation using weighted requirements (see D.2 for details).

– Token ring arbitration: a token is circulated among the masters in a round robin fashion. The token is modeled using the variables $tok_i$ (1 ≤ i ≤ n). Exactly one of the $tok_i$'s will hold at any time, and if $tok_i$ is true then master i holds the token. The arbiter asserts acknowledgement $a_i$ whenever request $r_i$ and $tok_i$ are true, i.e. priority is accorded to the request of the master which holds the token.

\[
\begin{aligned}
TokInit(n) &= \langle tok_1 \;\&\&\; (\wedge_{2 \le i \le n}\, !tok_i) \rangle \,\hat{}\, true \\
TokCirculate(n) &= [](\wedge_{1 \le i \le n} (tok_i \,\hat{}\, (slen = 1) \Leftrightarrow (slen = 1) \,\hat{}\, tok_{i \% n + 1})) \\
TokResp(n) &= \wedge_{1 \le i \le n} [[(r_i \;\&\&\; tok_i) \Rightarrow a_i]] \\
Token(n) &= TokInit(n) \wedge TokCirculate(n) \wedge TokResp(n).
\end{aligned}
\tag{6}
\]

Let ARBTOKEN(n) = ARBINV(n) ∧ Token(n). Then $Arb^{tok}(n)$ is the following DCSynth specification.

\[ Arb^{tok}(n) = (\{r_1, \ldots, r_n\}, \{a_1, \ldots, a_n\}, ARBTOKEN(n), \langle\rangle, \langle\rangle) \tag{7} \]

A.2 Alarm Annunciation System

The next case study is the Alarm Annunciation System (AAS) used in a process control system for annunciation of various alarms in the control room. The Alarm Annunciation involves the standard Automatic Ring-Back Sequence for all the digital inputs meant for alarm annunciation and provides the necessary outputs. The specification of the Automatic Ring-Back Sequence is given in Table 2. All digital inputs representing alarm conditions are scanned periodically.

Table 2. Automatic Ring-Back Sequence for Alarm Annunciation

Input             | Lamp Output    | Audio Output
Normal to Alarm   | Fast Flashing  | Normal Alarm Hooter On
Acknowledged      | Lamp On        | Normal Alarm Hooter Off
Alarm to Normal   | Slow Flashing  | Ringback Hooter On
Reset             | Lamp Off       | Ringback Hooter Off

As shown in Table 2, the Automatic Ring-Back Sequence specification takes the alarm signal as input. A high value of the signal represents the alarm state; otherwise the signal is said to be in the normal state. Other inputs are the Acknowledgment, Reset and Silence inputs, which are controlled by the operator. There are three output elements: Lamp, Normal Hooter and Ringback Hooter. There is a Lamp corresponding to each alarm signal, whereas the Hooters are common to all alarm signals. The Lamp can be in the Fast Flashing, Slow Flashing, Steady On or Off state. We have encoded the requirements in DCSynth to synthesize the controller. The Silence input can be used by the operator to switch off the Hooters.
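The sequence of Table 2 can be read as a small event-driven update of the three outputs. The sketch below is a simplified illustration, not the synthesized controller: the event and output names are hypothetical, the initial state (everything off) is an assumption, and the Silence input as well as any ordering constraints between events are ignored.

```python
# Effect of each input event on the outputs, transcribed from Table 2.
RING_BACK = {
    "normal_to_alarm": {"lamp": "fast_flashing", "normal_hooter": True},
    "acknowledged":    {"lamp": "on",            "normal_hooter": False},
    "alarm_to_normal": {"lamp": "slow_flashing", "ringback_hooter": True},
    "reset":           {"lamp": "off",           "ringback_hooter": False},
}

def run_ring_back(events):
    """Apply the events in order and return the outputs after each one."""
    # Assumed initial condition: lamp off and both hooters silent.
    outputs = {"lamp": "off", "normal_hooter": False, "ringback_hooter": False}
    trace = []
    for event in events:
        outputs = {**outputs, **RING_BACK[event]}
        trace.append(dict(outputs))
    return trace

TRACE = run_ring_back(["normal_to_alarm", "acknowledged", "alarm_to_normal", "reset"])
```

Each event only overrides the outputs named in its table row, so the ring-back hooter state is untouched by an acknowledgment and the normal hooter state is untouched by a reset.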

Result of Synthesis: As discussed in the previous case study, soft requirements helped in specifying the requirements concisely. The synthesized controller has 8 states. We simulated the controller and verified its correctness.

A.3 Synthesis of 2 client arbiter

Fig. 1 gives the monitor automaton for the 2-client instance of the n-cell arbiter specification (see Section A.1). Each transition is labeled by a 4-bit vector giving the values of $r_1, r_2, a_1, a_2$.

Fig. 1. Safety Monitor Automaton: 2 Cell Arbiter

Fig. 2 gives the MPS automaton for the 2-cell arbiter computed from the safety monitor automaton of Fig. 1. (There is an additional reject state. All missing transitions are directed to it. These are omitted from the diagram for simplicity.) Note that this is a DFA whose transitions are labelled by 4-bit vectors representing the alphabet $2^{\{r_1, r_2, a_1, a_2\}}$. As defined in Definition 2, the DFA also denotes an output-nondeterministic Mealy machine with input variables $(r_1, r_2)$ and output variables $(a_1, a_2)$.

Fig. 2. Supervisors for $Arb^{hard}(2, 2)$: (a) MPS, (b) MPHOS determinized

The automaton is nondeterministic in output: for example, from state 1 on input (1, 1) it can move to state 2 with more than one choice of output. Using the output ordering $\langle ack_1, ack_2 \rangle$, we obtain the MPHOS controller automaton of Fig. 2(b) from the MPS of Fig. 2(a). Note that we minimize the automaton at each step.
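The role of the output ordering in turning an output-nondeterministic supervisor into a deterministic controller can be sketched as follows. This is an illustration under assumptions, not the tool's symbolic procedure: the supervisor encoding, the function `determinize` and the state numbers in the example are hypothetical and are not taken from Fig. 2.

```python
def determinize(sup, ordering):
    """Pick one output per (state, input) according to a priority ordering.

    `sup['delta']` maps (state, input) to a list of (output, next_state)
    choices, where an output is a tuple of 0/1 values. `ordering` lists
    (position, preferred_value) pairs, most important first.
    """
    def rank(choice):
        output, _ = choice
        # Lexicographic score: True for each preference the output meets.
        return tuple(output[pos] == val for pos, val in ordering)

    return {key: max(choices, key=rank) for key, choices in sup["delta"].items()}

# Hypothetical fragment of a 2-client supervisor: when both clients request,
# either acknowledgment may be given (mutual exclusion forbids both).
SUP = {"delta": {
    (1, (1, 1)): [((0, 1), 3), ((1, 0), 2)],
    (1, (1, 0)): [((1, 0), 2)],
}}
A1_FIRST = determinize(SUP, [(0, 1), (1, 1)])  # prefer a1 = 1, then a2 = 1
A2_FIRST = determinize(SUP, [(1, 1), (0, 1)])  # prefer a2 = 1, then a1 = 1
```

The ordering is applied only among the choices that the supervisor already allows, which matches the remark in the main text that hard and soft requirements are fulfilled before the output orderings are applied.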

B Specification of Mine-pump and Arbiter example in DCSynth

The specifications of MinePump(8,2,6,2) and Arbiter(5,3,2) in DCSynth syntax are given in Figures 3 and 4, respectively.

DCSynth Tool Usage and performance evaluation

The tool DCSynth takes a specification file as input, for example the Arbiter specification Arbiter.qsf shown in Figure 4. This file lists the input and output alphabets in the interface section. The definitions/macros required for specifying the hard and soft requirements are contained in the definitions section. This is followed by a section called indefinitions, which specifies the required indicating monitor for a given formula (or corresponding automaton). Finally, the sections hardreq and softreq define the hard and soft requirements respectively, using the definitions and indicating monitors.

Fig. 3. Mine-pump specification in DCSynth

    {
      input HH2Op, HCH4p;
      output PUMPONp monitor x, ga monitor x;
      constant w = 8, epsilon=2, zeta=6, kappa=2;
    }
    definitions {
      //Methane release assumptions
      dc methane1(HCH4) { []([HCH4]^[!HCH4]^<HCH4> => slen > zeta); }
      dc methane2(HCH4) { []([[HCH4]] => slen < kappa); }
      //Pump capacity assumption
      dc pumpcap1(HH2O, PUMPON) { ([]!(slen=epsilon && ([[PUMPON && HH2O]] ^ <HH2O>))); }
      dc MineAssume_2_6_2(HH2O, HCH4, PUMPON) {
        methane1(HH2O, HCH4, PUMPON) && methane2(HH2O, HCH4, PUMPON) &&
        pumpcap1(HH2O, HCH4, PUMPON);
      }
      //safety condition
      dc req1(HH2O, HCH4, PUMPON) { true^<((HCH4 || !HH2O) => !PUMPON)>; }
      dc req2(HH2O, HCH4, PUMPON) { (!(true ^([[HH2O]] && (slen = w)))) }
      dc MineCommit_8(HH2O, HCH4, PUMPON) {
        req1(HH2O, HCH4, PUMPON) && req2(HH2O, HCH4, PUMPON);
      }
    }
    indefinitions { ga : MineCommit_8(HH2Op, HCH4p, PUMPONp); }
    hardreq { MineAssume_2_6_2(HH2Op, HCH4p, PUMPONp) => MineCommit_8(HH2Op, HCH4p, PUMPONp); }
    softreq { useind ga; (ga); }

Fig. 4. Arbiter specification in DCSynth

    {
      input r1, r2, r3, r4, r5;
      output a1, a2, a3, a4, a5, ga3;
      constant n=3;
    }
    definitions {
      // Specification 1: The acknowledgments should be exclusive
      dc exclusion() {
        true^<((a1 => !(a2 || a3 || a4 || a5)) && (a2 => !(a1 || a3 || a4 || a5)) &&
        (a3 => !(a1 || a2 || a4 || a5)) && (a4 => !(a1 || a2 || a3 || a5)) &&
        (a5 => !(a1 || a2 || a3 || a4)))>;
      }
      dc noloss() { true^<(r1 || r2 || r3 || r4 || r5) => (a1 || a2 || a3 || a4 || a5)>; }
      // Bus access (ack) should be granted only if there is a request
      dc nospuriousack(a1, r1) { true^<(a1) => (r1)>; }
      // n cycle response i.e. slen=n-1
      dc response(r1,a1) { true^(slen=n-1 && [[r1]]) => true^(slen=n-1 && (scount a1 >= 1)); }
      dc ArbAssume_5_2() {
        [[ (!r1 && !r2 && !r3 && !r4 && !r5) ||
           (r1 && !r2 && !r3 && !r4 && !r5) ||
           (!r1 && r2 && !r3 && !r4 && !r5) ||
           (!r1 && !r2 && r3 && !r4 && !r5) ||
           (!r1 && !r2 && !r3 && r4 && !r5) ||
           (!r1 && !r2 && !r3 && !r4 && r5) ||
           (r1 && r2 && !r3 && !r4 && !r5) ||
           (r1 && !r2 && r3 && !r4 && !r5) ||
           (r1 && !r2 && !r3 && r4 && !r5) ||
           (r1 && !r2 && !r3 && !r4 && r5) ||
           (!r1 && r2 && r3 && !r4 && !r5) ||
           (!r1 && r2 && !r3 && r4 && !r5) ||
           (!r1 && r2 && !r3 && !r4 && r5) ||
           (!r1 && !r2 && r3 && r4 && !r5) ||
           (!r1 && !r2 && r3 && !r4 && r5) ||
           (!r1 && !r2 && !r3 && r4 && r5) ]];
      }
      dc guranteeInv() {
        exclusion() && noloss(HH2O, HCH4, PUMPON) && nospuriousack(a1,r1) &&
        nospuriousack(a2,r2) && nospuriousack(a3,r3) && nospuriousack(a4,r4) &&
        nospuriousack(a5,r5);
      }
      dc guranteeResp() {
        response(r1,a1) && response(r2,a2) && response(r3,a3) &&
        response(r4,a4) && response(r5,a5);
      }
      dc ArbCommit_5_3() { guranteeInv() && guranteeResp(); }
    }
    indefinitions { ga3 : ArbCommit_5_3(); }
    hardreq { ArbAssume_5_2() => ArbCommit_5_3(); }
    softreq { useind ga3; (ga3); }

The steps to synthesize a controller from the specification file are as follows.

– First we generate the DFAs for $D^h$ and $D^s$ and the required input/output partitioning file using the qsf command. For the Arbiter example, we use qsf Arbiter.qsf to generate files named Arbiter.hardreq.dfa, Arbiter.softreq.dfa and Arbiter.io, as per step 1 of the synthesis method in Section 4.

– We then use the command synth2 Arbiter.hardreq.dfa Arbiter.softreq.dfa Arbiter.io synth.config to synthesize the supervisors, as per steps 2 and 3 of the synthesis method in Section 4. The file synth.config provides configuration parameters such as the number of iterations for the H-optimal supervisor. The command produces the supervisors MPS.dfa and MPHOS.dfa.

– We then determinize MPHOS.dfa using default values, with the command synth deterministic MPHOS.dfa default.io, to get a controller called Controller.dfa. The file default.io contains an ordered list of output literals. For example, if we have two outputs o1 and o2, then the list { !o1,o2 } tells the tool to determinize MPHOS.dfa with the following priority over outputs: { !o1,o2 } >> { !o1,!o2 } >> { o1,o2 } >> { o1,!o2 }.

– To measure the expected value of the soft requirement being satisfied, the command aut2mrmc Controller.dfa default.io index is used. The index parameter is the index of the indicator variable (for the soft requirement) in Controller.dfa. The command produces the files Controller.tra and Controller.lab, which can be imported into the tool MRMC to compute the expected value.

– Apart from expected case performance, the tool also provides a method for checking must-dominance between two given supervisors $S_1$ and $S_2$. As supervisors are finite state Mealy machines and a commitment $C$ is a regular property, we can use validity checking of QDDC to check whether $S_1 \le_{dom}^{C} S_2$, as formulated in the following proposition. The tool also gives a counterexample if must-dominance fails.

Proposition 4.

Let $D(S_i)$ denote QDDC formulas with the same language as the supervisors $S_i$, and let $C$ be a regular property over the input-output alphabet $(I, O)$. Then $S_1 \le_{dom}^{C} S_2$ iff

$\models_{qddc} \forall I\, \big( [\forall O.\, (D(S_1) \wedge C)] \Rightarrow [\forall O.\, (D(S_2) \wedge C)] \big)$

We use this facility to compare supervisors obtained from the different types of specifications discussed in Section 5.1.

D Synthesis with Semi-Symbolic DFA

An interesting representation for total and deterministic finite state automata was introduced and implemented by Klarlund et al. in the tool MONA [15]. It was used to efficiently compute the formula automaton for MSO over finite words. We denote this representation as Semi-Symbolic DFA (SSDFA). In this representation, the transition function is encoded as a multi-terminal BDD (MTBDD). The reader may refer to the original papers [15, 16] for further details of MTBDD and the MONA DFA library.

Here, we briefly describe the SSDFA representation, and then consider controller synthesis on SSDFA. Figure 5(a) gives an explicit DFA. Its alphabet $\Sigma$ is the set of 4-bit vectors giving the values of the propositions $(r_1, r_2, a_1, a_2)$, and its set of states is $S = \{1, 2, 3, 4\}$. This automaton has a unique reject state 4 and all the missing transitions are directed to it. (State 4 and the transitions to it are omitted in Figure 5(a) for brevity.)

Fig. 5. $A^{mps}$ for $Arb^{hard}(2, 2)$. (a): External format (b): SSDFA format

Figure 5(b) gives the SSDFA for the above automaton. Note that the states are explicitly listed in the array at the top; final states are marked as 1 and non-final states as −1. (For technical reasons there is an additional state 0, which may be ignored here, and state 1 may be treated as the initial state.) Each state $s$ points to a shared MTBDD node encoding the transition function $\delta(s) : \Sigma \to S$, with each path ending in the next state. Each circular node of the MTBDD represents a decision node, with indices 0, 1, 2, 3 corresponding to the variables $r_1, r_2, a_1, a_2$. Solid edges lead to true co-factors and dotted edges to false co-factors.

MONA provides a DFA library implementing automata operations including product, complement, projection and minimization on SSDFA. Moreover, automata may be constructed from scratch by giving a list of states and adding transitions one at a time. A default transition must be given to make the automaton total. The tools MONA and DCVALID use eager minimization while converting a formula into an SSDFA.

Remark 1: The DFA in Figure 5 also denotes an output-nondeterministic Mealy machine with input alphabet $(r_1, r_2)$ and output alphabet $(a_1, a_2)$. The automaton is nondeterministic in its output, as state 1 has transitions $\delta(1, (1, \ldots))$ on the same input with different outputs. ... $(I, O, D^h, D^s)$, without actually expanding the specification automata into a game graph. The use of SSDFA leads to a significant improvement in the scalability and computation time of the tool.

D.1 Computing Maximally Permissive Supervisor (MPS)

Recall the synthesis method in Section 4. Let the hard requirement automaton be $A(D^h) = \langle S, I \cup O, \delta, F \rangle$. We construct the maximally permissive supervisor by iteratively applying $Cpre(A(D^h), X)$ to compute the set of winning states $G$, as outlined in Section 4.1. This requires an efficient implementation of $Cpre(A(D^h), X)$ over the SSDFA $A(D^h)$. The symbolic algorithm for Cpre marks (a) each leaf node representing state $s$ with the truth value of $s \in X$, (b) each decision node associated with an input variable with the AND of its children's values, and (c) each decision node associated with an output variable with the OR of its children's values. The computation is carried out bottom-up on the MTBDD and takes time $|MTBDD|$, where $|MTBDD|$ is the number of BDD nodes in it. In contrast, the enumerative method for implementing Cpre would take time of the order of $2^{|I \cup O|}$.

Next we compute the automaton $MPS(D^h) = \langle G \cup \{r\}, I \cup O, G, \delta' \rangle$ by retaining only the transitions between the winning states $G$. Here $r$ is the unique reject state introduced to make the automaton total. We consider the following two methods.

– Enumerative method: $MPS(D^h)$ is constructed from $A(D^h)$ by adding one transition at a time as follows: for any $s \in G$, if $\delta(s, (i, o)) \in G$ then $(s, (i, o), \delta(s, (i, o))) \in \delta'$. Clearly, this algorithm has time complexity $|S| \times 2^{|I \cup O|}$. Finally, we make $A^{mps}$ total by adding all the unaccounted transitions from any state to the reject state $r$.

– Symbolic method: In this method, the MTBDD of $A(D^h)$ is modified so that each edge pointing to a state in $S - G$ is changed to go to the reject state $r$. Note that this makes the states in $S - (G \cup \{r\})$ inaccessible. The modified SSDFA is then minimized to get rid of inaccessible states and to obtain a smaller MPS. The time complexity of this computation is $O(|MTBDD|)$ for modifying the links and $N \cdot t \cdot \log(N)$ for minimization, where $N$ is the number of states and $t$ is the size of the alphabet in $A(D^h)$.

In Table 3 we give experimental results comparing the computation of $MPS(D^h)$ using the two algorithms. It can be seen that the symbolic algorithm can be faster by more than an order of magnitude (for example, 0.91 seconds against 42.75 seconds for $Arb^{hard}(6, 6)$). This is because we do not construct the MPS from scratch; instead we only redirect some links in the MTBDD of $A(D^h)$, which is already computed. The Mine-pump specification used in Table 3 is given in Section 5.3.

D.2 Computing H-Optimal Supervisor (MPHOS)

In this step we compute the MPHOS from the MPS. Given a maximally permissive supervisor MPS, a QDDC formula $D^s$ and an integer parameter $H$, we obtain the H-optimal sub-supervisor of MPS, called MPHOS, by iteratively computing $Val(s, p + 1)$ from $Val(s, p)$ for $0 \le p < H$, as outlined in Section 4.2. This step can be denoted by $VALpre(A^{Arena})$.

Remark 2: For $A^{Arena}$ a transition has the form $\delta(s, (i, o, v))$ with $i \in I$, $o \in O$, $v \in \{w\}$. However, from the definition of $Ind(D^s, w)$, the value of $w$ is uniquely determined by $(s, (i, o))$ in the corresponding automaton $A(Ind(D^s, w))$. Hence we can abbreviate the transition as $\delta(s, (i, o))$. □

Note that the tool DCSynth in general allows a lexicographically ordered list of soft requirements, given as several QDDC formulas. The tool implements an MPHOS computation based on this lexicographical list (or on an explicit weight for each soft requirement), using as the weight of each transition the sum of the weights of all the soft requirements satisfied on that transition. The tool also allows a discounting factor γ, which is used to give higher weight to requirements satisfied in the near future.

Table 3. MPS synthesis: enumerative vs symbolic method (time in seconds). For $A(D^h)$ we give the number of states and the time to compute it from the QDDC hard requirement formula. For $MPS(D^h)$ we give its number of states and the time to compute it using the two methods. St, Ts, En and Sy denote total number of states, time in seconds, enumerative method and symbolic method respectively. The example $Arb^{hard}(n, k)$ represents the specification Type Arb(n, k, n) and $Arb^{soft}(n, k)$ represents the example Type Arb(n, k, n).

| Example | Hard Requirement | A(D^h) St | A(D^h) Ts | MPS(D^h) St | MPS(D^h) Ts (En) | MPS(D^h) Ts (Sy) |
|---|---|---|---|---|---|---|
| Arb^hard(4, 4) | ARBHARD(4, 4) | 177 | 0.04 | 126 | 0.025563 | 0.002033 |
| Arb^hard(5, 5) | ARBHARD(5, 5) | 2103 | 0.43 | 1297 | 0.59 | 0.04 |
| Arb^hard(6, 6) | ARBHARD(6, 6) | 31033 | 9.22 | 16808 | 42.75 | 0.91 |
| Arb^soft(4, …) | ARBINV(4) | 3 | 0.016 | 2 | 6.2E-4 | 7.6E-5 |
| Arb^soft(5, …) | ARBINV(5) | 3 | 0.020 | 2 | 1.9E-3 | 1.2E-4 |
| Minepump(8, 2, 6, 2) | MineAssume(2, …, 1) ⇒ ArbCommit(8) | 271 | 0.08 | 211 | 1.4E-2 | 5.8E-3 |
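The symbolic Cpre marking of Section D.1 can be illustrated with a small sketch. It is not the DCSynth or MONA implementation: the classes `Leaf` and `Node`, the functions `cpre` and `winning_states`, and the toy automaton are hypothetical names introduced here only for illustration. The sketch assumes that input variables occur above output variables in the MTBDD, and it assumes that the winning states are obtained as a greatest fixed point starting from the accepting states, since Section 4.1 is not reproduced in this appendix.

```python
from dataclasses import dataclass
from typing import Dict, Set, Union


@dataclass(frozen=True)
class Leaf:
    state: int  # next state reached at the end of an MTBDD path


@dataclass(frozen=True)
class Node:
    var: int        # index of the decision variable
    low: "MTBDD"    # child followed when the variable is false
    high: "MTBDD"   # child followed when the variable is true


MTBDD = Union[Leaf, Node]


def cpre(delta: Dict[int, MTBDD], input_vars: Set[int], X: Set[int]) -> Set[int]:
    """States from which, for every input, some output leads into X.

    Each MTBDD node is marked once (memoised by object identity):
    leaf -> membership in X, input node -> AND, output node -> OR.
    """
    memo: Dict[int, bool] = {}

    def mark(n: MTBDD) -> bool:
        key = id(n)
        if key in memo:
            return memo[key]
        if isinstance(n, Leaf):
            val = n.state in X
        else:
            lo, hi = mark(n.low), mark(n.high)
            val = (lo and hi) if n.var in input_vars else (lo or hi)
        memo[key] = val
        return val

    return {s for s, root in delta.items() if mark(root)}


def winning_states(delta: Dict[int, MTBDD], input_vars: Set[int],
                   accepting: Set[int]) -> Set[int]:
    """Assumed fixed-point shape: shrink the accepting set until stable."""
    X = set(accepting)
    while True:
        nxt = X & cpre(delta, input_vars, X)
        if nxt == X:
            return X
        X = nxt


# Toy example: one request r (variable 0, input), one acknowledgment a (variable 1, output).
R, A = 0, 1
delta = {
    1: Node(R, low=Node(A, low=Leaf(1), high=Leaf(2)),
               high=Node(A, low=Leaf(2), high=Leaf(1))),
    2: Leaf(2),                              # reject sink
    3: Node(R, low=Leaf(3), high=Leaf(2)),   # loses whenever r is raised
}
G = winning_states(delta, {R}, {1, 3})
```

In this toy automaton state 3 is accepting but not winning, because the environment can raise `r` and force a move to the reject state 2 whatever the output is; only state 1 survives the iteration. Memoising on shared nodes is what keeps the cost proportional to the number of MTBDD nodes rather than to the number of input-output valuations.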

Now, to compute MPHOS we again have two methods: an enumerative method and a symbolic method. We give the algorithm and the associated complexity results for one value iteration (i.e. for VALpre followed by the $o_{max}$ computation). Let $Q$ be the set of states of $A^{Arena}$.

– Enumerative method: As given in Step (3) of the synthesis method, for each state $s$ we need to enumerate all paths starting from $s$ to get $Val(s, p + 1)$ from $Val(s, p)$, which takes time of the order of $2^{|I \cup O|} \times k$, where $k$ is the number of soft requirements (in this paper $k$ is assumed to be 1). A similar complexity is required to get the list of transitions with maximum value, denoted $o_{max}$ (note that there can be multiple transitions with the same $o_{max}$; all such transitions are included in MPHOS). As the algorithm terminates after $H$ iterations, the total time complexity of the entire algorithm is $|Q| \times 2^{|I \cup O|} \times H$ (for $k = 1$).

– Symbolic method: For this optimization to be applicable we assume that in the MTBDD representation of $A^{Arena}$, all the input variables occur before the output variables $O$ and the indicating variable $w$ (in general it can be a set if $k > 1$). A node is called a frontier node if it is labelled with an output or a witness variable, and all its ancestors are labelled with input variables. For example, in Figure 5(b), these are the nodes labelled 2 (they happen to occur at the same level in this example). For each frontier node, enumerate each path $\pi$ within the MTBDD below the frontier node (this fixes the values of $(o, w) \in O \times W$ occurring on $\pi$ as well as the next state $s'$). Update the optimal $o_{max}$ as well as the next state $s_o$ based on $wt(o, v)$ for the paths seen so far. This takes time $O(d_f \times k)$, where $d_f$ is the number of paths in the MTBDD below the frontier node $f$. The optimal output $o_{max}(f)$ as well as the next state $s_o$ for each value iteration is stored in each frontier node $f$. The total time taken is $O(d_{output} \times H)$, where $d_{output} = \Sigma_{f \in Fr}\, d_f$ and $Fr$ is the set of all frontier nodes; $H$ is the number of steps in value iteration.

In the second step, for each state $s \in Q$, enumerate each path from state $s$ to a frontier node $f$. This fixes the valuation of the input $x$. Insert a transition $\delta_{mphos}(s, (x, o_{max})) = s_o$ into $A^{mphos}$. Let the total number of paths up to frontier nodes be $d_{input}$. Then the second step takes time $O(d_{input} + |Q|)$, where the time taken to insert a transition in $A^{mphos}$ is assumed to be constant. Hence the total time of the entire algorithm for computing $A^{mphos}$ is $O(d + |Q|) \times H$, where $d$ is the total number of paths in the MTBDD of $A^{Arena}$ (here $k$ is assumed to be 1).

It may also be noted that in the worst case the total number of MTBDD paths $d$ is of size $O(2^{|I \cup O|})$ and the two algorithms have comparable complexity. But in most cases $d \ll 2^{|I \cup O|}$ and the symbolic algorithm turns out to be more efficient.

D.3 Computing a Controller using default value

The controller can be computed from MPHOS for a given default value order ord, using an algorithm similar to the one given for the MPHOS computation. Here we treat the default values as the soft requirements and take H equal to 1, so the MPHOS computation algorithm chooses those transitions whose outputs locally satisfy the default values given in ord. As ord provides a default value for every output variable, there is always a unique output that maximally satisfies the default values. Hence the result is always a deterministic Mealy machine, i.e. we get a controller.

E Comparison with Other tools

In Table 4 we compare the performance of DCSynth with a few leading tools for LTL synthesis. The examples in QDDC were manually translated into bounded LTL properties to give them as input to Acacia+ [5] and BoSy [11]. We have considered only examples with hard requirements, as these tools do not support soft requirements. The online version of the BoSy tool was used, which enforces a maximum timeout of 600 seconds. For the other tools, a local installation on a Linux (Ubuntu 16.04) system with an Intel i5 64-bit, 2.5 GHz processor and 4 GB memory was used, with a timeout of 3600 seconds. In this comparison DCSynth was used with the symbolic algorithm for both the MPS and the MPHOS computation. Note that for these examples the MPHOS algorithm always terminates after one iteration, as the examples do not have soft requirements, so DCSynth chooses one of the possible outputs from the MPS based on the default output order. We provided the default output order $a_1 > \ldots > a_i$ for all types of arbiter example, and PumpOn for the Mine-pump example.

Table 4. Comparison of synthesis in Acacia+, BoSy and DCSynth, in terms of controller computation time, memory, and number of states of the controller automaton. The Minepump as well as the $Arb^{tok}(n)$ specifications can be found in Appendix A.1.

| Hard Requirement | Acacia+ time (sec) | Acacia+ Memory / States | BoSy time (sec) | BoSy Memory / States | DCSynth time (sec) | DCSynth Memory / States |
|---|---|---|---|---|---|---|
| Arb^hard(4, 4) | 0.4 | 29.8 / 55 | 0.75 | - / 4 | 0.08 | 9.1 / 50 |
| Arb^hard(5, 5) | 11.4 | 71.9 / 293 | 14.5 | - / 8 | 5.03 | 28.1 / 432 |
| Arb^hard(6, 6) | TO (a) | - | TO | - | 80 | 1053.0 / 4802 |
| Arb^tok(7) | 9.65 | 39.1 / 57 | TO | - | 0.3 | 7.3 / 7 |
| Arb^tok(8) | 46.44 | 77.9 / 73 | - | - | 2.2 | 16.2 / 8 |
| Arb^tok(10) | NC (b) | - | - | - | 152 | 82.0 / 10 |
| Mine-pump | NC | - | TO | - | 0.06 | 50 / 32 |

Experiments with BoSy use the online version.
(a) TO = timeout (DCSynth and Acacia+ 3600 secs, BoSy 600 secs)
(b) NC = synthesis inconclusive

As the comparison table above shows, the DCSynth approach seems to outperform the state-of-the-art tools in scalability and controller computation time. This is largely due to the pragmatic design choices made in the logic QDDC and the tool DCSynth. It can also be seen that BoSy often results in a controller with fewer states. BoSy is specifically optimized to resolve nondeterminism so as to get fewer states. In our case, the tool is optimized to satisfy the maximal number of soft requirements. It would be interesting to merge the two techniques for best results.
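The default output order used to resolve the remaining nondeterminism can be illustrated with a short sketch. It reproduces the priority given for the default.io example in the tool usage section (list { !o1,o2 }), reading the ordered literal list as a lexicographic preference in which earlier literals matter more. This is only an explicit-state illustration under that reading, not DCSynth's symbolic implementation; the function names `priority_order`, `choose_output` and `show` are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple

Valuation = Dict[str, bool]


def priority_order(defaults: List[Tuple[str, bool]]) -> List[Valuation]:
    """All output valuations, most preferred first.

    `defaults` is the ordered literal list, e.g. [("o1", False), ("o2", True)]
    for { !o1, o2 }. A valuation that deviates from an earlier literal is
    ranked below every valuation that agrees with it.
    """
    ranked = []
    # product yields mismatch patterns in lexicographic order, first literal
    # most significant: (F,F), (F,T), (T,F), (T,T) for two outputs.
    for mismatch in product([False, True], repeat=len(defaults)):
        ranked.append({name: (pref != miss)
                       for (name, pref), miss in zip(defaults, mismatch)})
    return ranked


def choose_output(allowed: List[Valuation],
                  defaults: List[Tuple[str, bool]]) -> Valuation:
    """Pick the most preferred output among those the supervisor allows."""
    for candidate in priority_order(defaults):
        if candidate in allowed:
            return candidate
    raise ValueError("supervisor allows no output")


def show(v: Valuation) -> str:
    """Render a valuation in the paper's literal notation, e.g. {!o1,o2}."""
    return "{" + ",".join(n if val else "!" + n for n, val in v.items()) + "}"


defaults = [("o1", False), ("o2", True)]
order = [show(v) for v in priority_order(defaults)]
```

For the list { !o1,o2 } the resulting `order` is {!o1,o2}, {!o1,!o2}, {o1,o2}, {o1,!o2}, matching the priority stated for default.io. Because the order is total over output valuations, `choose_output` always returns exactly one output for any non-empty set of allowed outputs, which is why determinization yields a deterministic Mealy machine.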

F Measuring latency using Model Checking

A plethora of synthesis algorithms and optimizations gives rise to diverse controllers for the same requirement. In comparing the quality of these different controllers, an important measure is their worst case latency. Latency can be defined as the time (number of steps) taken to achieve some desired behaviour. In our framework, to specify latency the user must give a QDDC formula $D_p$ characterizing the execution fragments of interest. For example, the QDDC formula $D_p$ = [[req && !ack]] specifies fragments of execution with the request continuously true but with no acknowledgment. Given a DFA (controller) $M$, the latency goal $MAXLEN(D_p, M)$ computes

$\sup \{\, e - b \mid \rho, [b, e] \models D_p,\ \rho \in Exec(M) \,\}$,

i.e. it computes the length of the longest interval satisfying $D_p$ across all the executions of $M$. Thus, it computes the worst case latency for achieving behaviour $D_p$ in $M$. For example, given a synchronous bus arbiter controller Arb, the goal MAXLEN([[req && !ack]], Arb) specifies the worst case response time of the arbiter Arb. The tool CTLDC, which like DCSynth and DCVALID is a member of the DCTOOLS suite of tools, provides efficient computation of MAXLEN by symbolic search for longest paths, as formulated in article [23]. This facility is used subsequently in the paper to compare the worst case response times achieved by various controllers synthesized under different criteria.

Table 5. Worst case response time analysis using CTLDC, computed with the response formula MAXLEN([[req_i && !ack_i]]). The value of H is specified only for $Arb^{soft}$. Entries shown as … could not be recovered.

| Sr.No | Arbiter Variant | Horizon (H) | Response for i-th cell | Computed Response Value (in cycles) |
|---|---|---|---|---|
| 1 | Arb^hard(5, 5) | - | 1 ≤ i ≤ 5 | 5 |
| 2 | Arb^hardAssume(5, 3, 2) | - | i = 1 | 2 |
| 3 | | | 2 ≤ i ≤ 5 | 3 |
| 4 | Arb^soft(5, 3) | H = 1 | 1 ≤ i ≤ 4 | ∞ |
| 5 | | | i = 5 | 3 |
| 6 | Arb^soft(5, 3) | H = 2 | 1 ≤ i ≤ … | ∞ |
| 7 | | | … ≤ i ≤ 5 | … |
| 8 | Arb^soft(5, 3) | H >= 3 | 1 ≤ i ≤ … | ∞ |
| 9 | | | … ≤ i ≤ 5 | … |

Table 5 gives worst case latency measurements carried out using the tool CTLDC for various controllers synthesized using DCSynth. For the Arbiter examples, the worst case response time (the maximum number of cycles a request remains continuously true without an acknowledgment) is measured using the CTLDC formula MAXLEN([[req_i && !ack_i]]) for each cell i of the various arbiters discussed in Section 5. We use arbiter variants with 5 cells (i.e. 1 ≤ i ≤ 5) for our experiments.

– The specifications $Arb^{hard}$ and $Arb^{hardAssume}$ do not have soft requirements; therefore guided synthesis chooses an arbitrary output from the constructed MPS, without any value iteration (i.e. H = 1). The results for these are as follows.

  • $Arb^{hard}(5, 5)$ has a worst case response time of 5 cycles for each cell. This happens when all the request lines are continuously on and the controller gives an acknowledgment to each cell in round-robin fashion.

  • $Arb^{hardAssume}(5, 3, 2)$ has a worst case response time of 2 cycles for the first cell and 3 cycles for all the other cells, provided the assumptions are met. The assumption constrains at most 2 requests to be on at any point of time. If the assumptions are not met, the 3 cycle response cannot be guaranteed (for instance, if the requests from all 5 cells are on continuously).

– For the specification $Arb^{soft}(5, 3)$, the response requirement is that every cell should get an acknowledgment within 3 cycles if its request is continuously true (this would be unrealizable if stated only as a hard requirement). However, a controller which satisfies these requirements as much as possible was generated using DCSynth.

  • For example, $Arb^{soft}(5, 3)$ "tries" to give an acknowledgment within 3 cycles, with higher priority assigned to higher numbered cells (see the description in Section 5). When all the requests are on simultaneously, $req_5$ gets the highest priority and hence can always have a worst case response time of 3 cycles, but $req_1$, given the lowest priority, may end up with a worst case response time of ∞ (when the requests from higher numbered cells are always true).

  • Another important observation is that DCSynth may generate different controllers for different horizons (numbers of value iterations) given for the MPHOS computation. Intuitively, as the horizon tends to ∞, the controller produced gets closer to global optimality. This effect can be seen in rows 4–9, where the horizon moves from 1 to more than 2. For horizon 1, DCSynth produces a locally optimal controller, and hence the controller only guarantees the response time for the highest priority cell (i.e. cell no. 5, see row 5). For all other cells the worst case response is ∞