On the Power of Automata Minimization in Temporal Synthesis
Shufang Zhu∗, Lucas M. Tabajara†, Geguang Pu∗‡ and Moshe Y. Vardi†
∗Shanghai Industrial Control Safety Innovation Technology Co., Ltd, Shanghai, China
†Rice University, Houston, USA
‡East China Normal University, Shanghai, China
Abstract—Reactive synthesis is the problem of automatically constructing a reactive system from a formal specification, with the guarantee that the executions of the system align with the specification. The specification is often described in temporal logic. Some types of specifications can be converted into deterministic finite automata (DFA) as an intermediate step in synthesis, thus benefiting from the fact that DFAs can be fully minimized in polynomial time. In this work we investigate DFA-minimization algorithms in the context of temporal synthesis. In particular, we compare the Hopcroft and Brzozowski minimization algorithms, adapting them to start from temporal-logic formulas and integrating them into an existing temporal-synthesis framework. While earlier studies comparing the two algorithms on randomly-generated automata concluded that neither algorithm dominates, our results suggest that in the context of temporal synthesis, Hopcroft's algorithm is the best choice. Analyzing the results, we observe that the reason for the poor performance of Brzozowski's algorithm is a discrepancy between theory and practice. This algorithm first constructs a DFA for the reverse language of the specification and then performs a series of operations to transform it into a minimal DFA for the specification itself. In theory, the DFA for the reverse language can be exponentially smaller, which would potentially make this algorithm more efficient than directly constructing the DFA for the original specification. In practice, however, we find that the reverse DFA is often of comparable size or even larger, which cancels the advantage that this approach could have.
I. INTRODUCTION
Designing correct hardware modules is challenging, which is why formal verification has become so crucial in the process of hardware design. Once a formal specification is written for a piece of software, however, it might be necessary to redesign the system multiple times to correct bugs until it actually adheres to the specification. It has been proposed that a better alternative would be to automatically generate the hardware design from this high-level specification, thus obtaining a full design that is correct by construction [1]. This approach is known as reactive synthesis [2], [3].

Reactive synthesis algorithms usually make use of automata-theoretic techniques to convert a high-level specification into a system design. In general, the automata being manipulated are deterministic automata over infinite words [2], [4]. For particular types of specifications, however, such as those in the safety fragment of Linear Temporal Logic (SafetyLTL), finite-word automata suffice [5]. Similarly to automata over infinite words, in the worst case the smallest deterministic finite automaton (DFA) equivalent to a given temporal-logic formula can be doubly exponential in the size of the formula [6]. Yet, unlike their infinite-word counterparts, DFAs support efficient minimization algorithms, and there is evidence that the doubly-exponential blow-up of the minimal DFA in fact does not tend to occur in the average case [7]. This means that synthesis applications that require compiling temporal-logic formulas to DFAs can still be implemented efficiently for many instances, as long as a good minimization algorithm is used. The natural question to ask, then, is: what is the best way in practice of constructing a minimal DFA from a temporal-logic formula?

An empirical evaluation of DFA minimization algorithms can be found in [8]. That work compares two classic algorithms for constructing a minimal DFA from a nondeterministic finite automaton (NFA).
The first is Hopcroft's algorithm [9], which first determinizes the NFA into a (not necessarily minimal) DFA, and then partitions the state space into equivalence classes. These equivalence classes then correspond to the states of the minimal DFA. The second is Brzozowski's algorithm [10], which reverses the automaton twice, determinizing and removing unreachable states after each reversal. If this sequence of operations is performed in order, it is guaranteed that the resulting DFA will be minimal. The conclusion reached by [8] is that neither algorithm dominates across the board, and the best algorithm to use for a given NFA depends in part on the NFA's transition density.

A few aspects make this evaluation unsatisfactory for our purposes, however. First, the algorithms were compared considering an NFA as a starting point, while we are interested in obtaining minimal DFAs from temporal-logic formulas. This difference in initial representation may require certain steps of the algorithms to be implemented in a different way that affects their complexity. Second, the evaluation was performed on NFAs generated using a random model, which might not be representative of automata obtained from such formulas. Third, automata generated from formulas tend to be semi-symbolic, having their transitions represented symbolically by data structures such as Binary Decision Diagrams (BDDs). This representation can also affect how certain operations are implemented, and therefore the performance of the algorithms.

With these issues in mind, in this work we reexamine the comparison between the Hopcroft and Brzozowski algorithms, this time using benchmarks from reactive synthesis of Linear Temporal Logic over finite traces (LTLf) [11]. In the standard approach for solving this problem, the LTLf specification is first converted into a DFA, and then a system satisfying the specification is synthesized by solving a reachability game over this DFA [12].
The game-solving step is commonly performed over a fully-symbolic representation of the DFA, which also encodes the state space symbolically in a logarithmic number of Boolean variables and can therefore be exponentially smaller than the explicit representation [13]. Since the performance of both steps depends on the DFA size, efficient minimization algorithms are crucial for this problem.

In our evaluation, Hopcroft's algorithm is represented by the tool MONA [14], which is the standard tool used for constructing DFAs from temporal formulas.
MONA constructs a DFA in a bottom-up fashion, first constructing small DFAs for subformulas and then progressively combining them while applying Hopcroft minimization after each step. Since there is no preexisting tool that makes use of Brzozowski's algorithm, we describe how it can be effectively simulated within the existing framework of LTLf synthesis. This is done by using MONA to construct a minimal DFA for the reverse language of the LTLf specification, which can then be reversed, determinized, and pruned of unreachable states to obtain the minimal DFA for the original language. The possible advantage of Brzozowski's algorithm is that the DFA for the reverse language of an LTLf formula is guaranteed to be at most exponential, rather than doubly exponential, in the size of the formula [11], [15].

We present two approaches for performing the final determinization step in Brzozowski's algorithm: an explicit one using routines implemented in the SPOT automata library [16], and a symbolic one which converts the reversed automaton directly into a fully-symbolic representation. The benefit of the symbolic approach is that it avoids constructing the semi-symbolic DFA, which may be exponentially larger than its fully-symbolic representation. On the other hand, this also means that the DFA is converted to the fully-symbolic representation before removing the unreachable states, which can lead to this representation being more complex than necessary. After the fully-symbolic representation is constructed, computing the reachable states serves only to reduce the search space during synthesis, but does not reduce the size of the representation.

We compare Hopcroft's and the two versions of Brzozowski's algorithm on a number of LTLf-synthesis benchmarks, evaluating not only the performance of the DFA construction but also how the resulting fully-symbolic DFA affects the game-solving.
We find that, despite the minimal DFA for the reverse language having an exponentially smaller theoretical upper bound compared to the minimal DFA for the original language, in practice it is often of similar size or even larger. Furthermore, our results suggest that the Brzozowski algorithm does not benefit from having the determinization performed symbolically, as the fully-symbolic representation becomes much less efficient and the reachability computation does not compensate for the overhead. As a consequence, we observe that Hopcroft's algorithm dominates significantly in the wide majority of the cases. We thus conclude that, unlike in [8], where the evaluation indicated that both minimization algorithms perform well in different cases, in the context of synthesis the Hopcroft approach is likely preferable.

II. PRELIMINARIES
A. LTLf and PastLTL

Linear Temporal Logic over finite traces, called LTLf [11], extends propositional logic with finite-horizon temporal connectives. In particular, LTLf can be considered a variant of Linear Temporal Logic (LTL) [17]. The key feature of LTLf is that it is interpreted over finite traces, rather than infinite traces as in LTL. Given a set of propositions P, the syntax of LTLf is identical to LTL, and defined as:

φ ::= ⊤ | ⊥ | p ∈ P | (¬φ) | (φ ∧ φ) | (Xφ) | (φ U φ).

⊤ and ⊥ represent true and false, respectively. X (Next) and U (Until) are temporal connectives. Other temporal connectives can be expressed in terms of those. A trace ρ = ρ[0], ρ[1], ... is a sequence of propositional assignments (sets), in which ρ[m] ∈ 2^P (m ≥ 0) is the m-th assignment of ρ, and |ρ| represents the length of ρ. Intuitively, ρ[m] is interpreted as the set of propositions that are true at instant m. A trace ρ is infinite if |ρ| = ∞, denoted as ρ ∈ (2^P)^ω; otherwise ρ is finite, denoted as ρ ∈ (2^P)^*. We assume the standard temporal semantics from [11], and we write ρ |= φ if finite trace ρ satisfies φ at timepoint 0. We define the language of a formula φ as the set of traces satisfying φ, denoted as L(φ).

We now introduce PastLTL, which specifies temporal properties over linear traces, but considering the past instead of the future. We restrict PastLTL to a syntactic fragment of PLTL, introduced in [17], so that it considers only past temporal connectives. PastLTL is defined as follows:

θ ::= ⊤ | ⊥ | p ∈ P | (¬θ) | (θ ∧ θ) | (Yθ) | (θ S θ).

We assume standard temporal semantics [18]. It should be noted that since PastLTL talks only about past histories, in effect it reasons about finite traces. This is because a temporal formula is asserted with respect to a specific position along a trace.
Thus, when considering only past connectives, when evaluating a formula θ at timepoint i of an infinite trace ρ, the truth of the formula depends only on the prefix ρ[0], ..., ρ[i]. We can therefore consider PastLTL a finite-trace temporal logic. If ρ is a finite trace, then we say that ρ |= θ if ρ, k−1 |= θ, where k = |ρ|. We define the language L(θ) as the set of finite traces satisfying θ; that is, L(θ) = {ρ | ρ |= θ}.

We can reverse an LTLf formula φ by replacing each temporal connective in φ with the corresponding past connective from PastLTL, thus getting φ^R. X (Next) and U (Until) are replaced by Y and S, respectively. For a finite trace ρ, we define ρ^R = ρ[|ρ|−1], ρ[|ρ|−2], ..., ρ[1], ρ[0] to be the reverse of ρ. We define the reverse of L as the set of reversed traces in L, denoted as L^R; formally, L^R = {ρ^R | ρ ∈ L}. The following theorem shows that the PastLTL formula φ^R accepts exactly the reverse language of φ.

Theorem 1 ([19]).
Let L(φ) be the language of LTLf formula φ and L^R(φ) be the reverse language; then L^R(φ) = L(φ^R).

B. Automata Representations

An LTLf formula can be represented by an automaton over finite words that accepts a trace if and only if that trace satisfies the formula. Here we introduce a few different automata representations. The difference between the representations here and the standard textbook representation of finite-state automata is that the alphabet is defined in terms of truth assignments to propositions.

Definition 1 (Nondeterministic Finite Automaton (NFA)). An NFA is represented as a tuple N = (P, S, S_0, η, Acc), where
• P is a set of propositions;
• S is a set of states;
• S_0 ⊆ S is a set of initial states;
• η : S × 2^P → 2^S is the transition function, such that given current state s ∈ S and an assignment σ ∈ 2^P, η returns a set of successor states;
• Acc ⊆ S is the set of accepting states.
If there is only one initial state s_0 and η returns a unique successor for each state s ∈ S and assignment σ ∈ 2^P, then we say that N is a deterministic finite automaton (DFA). In this case, η can be written in the form η : S × 2^P → S. The set of traces accepted by N is called the language of N and denoted by L(N).

A question that remains is how to represent the transition function η efficiently. It could be represented by a table mapping states and assignments in 2^P to sets of successor states, but this table would necessarily be exponential in the number of propositions. In practice, from a given state it is usually the case that multiple assignments lead to the same successor state. These assignments can then be represented collectively by a single Boolean formula λ. For a given state, the number of such formulas is usually much smaller than the number of assignments.

Therefore, the transition function can alternatively be represented by a relation H ⊆ S × Λ × S, where Λ is a set of propositional formulas over P. We then have (s, λ, s′) ∈ H for a formula λ iff s′ ∈ η(s, σ) for every σ ∈ 2^P that satisfies λ. Intuitively, the tuples of H can be thought of as edges in the graph representation of the automaton, labeled by the propositional formulas that match the transitions. It should be noted that MONA [14] adopts this representation, representing the propositional formulas as Binary Decision Diagrams (BDDs) [20].

We call the above a semi-symbolic automaton representation, as transitions are represented symbolically by propositional formulas but the states themselves are still represented explicitly. In contrast, we now present a fully-symbolic (symbolic for short) representation, in which both states and transitions are represented symbolically. In the fully-symbolic representation, states are encoded using a set of state variables Z, where a state corresponds to an assignment of these variables.
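To make the semi-symbolic representation concrete, the following is a minimal Python sketch: states are explicit integers, and each edge is labeled by a predicate over assignments that stands in for a propositional formula (a tool like MONA would use BDDs instead). The encoding and helper name below are illustrative assumptions, not taken from any existing tool.

```python
# Semi-symbolic DFA sketch: explicit states, transitions labeled by
# propositional formulas, modeled here as Python predicates over
# assignment dicts (a stand-in for BDD labels).

def run_semi_symbolic(edges, initial, accepting, trace):
    """edges: list of (src, predicate, dst); trace: list of assignments."""
    state = initial
    for sigma in trace:
        # In a DFA, exactly one labeled edge out of `state` matches sigma.
        state = next(dst for (src, lam, dst) in edges
                     if src == state and lam(sigma))
    return state in accepting

# Example DFA over P = {p} for "p holds at every instant":
edges = [
    (0, lambda a: a["p"],     0),   # stay in the accepting state while p holds
    (0, lambda a: not a["p"], 1),   # a single violation moves to a sink
    (1, lambda a: True,       1),
]
```

Note that two edges out of state 0 replace the four assignment-indexed table entries a purely explicit representation would need; with more propositions the savings grow accordingly.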
Definition 2 (Symbolic Deterministic Finite Automaton). A symbolic DFA corresponding to an explicit DFA A = (P, S, s_0, η, Acc), in which η is of the form η : S × 2^P → S, is represented as a tuple D = (P, Z, I, δ, f), where
• P is the set of propositions as in A;
• Z is a set of state variables with |Z| = ⌈log |S|⌉, and every state s in the explicit DFA corresponds to an assignment Z ∈ 2^Z of the variables in Z;
• I ∈ 2^Z is the initial assignment, corresponding to s_0;
• δ : 2^Z × 2^P → 2^Z is the transition function. Given the assignment Z of current state s together with transition condition σ, δ(Z, σ) returns the assignment Z′ corresponding to the successor state s′ = η(s, σ);
• f is a propositional formula over Z describing the accepting states; that is, each satisfying assignment Z of f corresponds to an accepting state s ∈ Acc.

Note that the transition function δ can be represented by an indexed family consisting of a Boolean formula δ_z for each state variable z ∈ Z, which when evaluated over an assignment to Z ∪ P returns the next assignment to z. Since the states are encoded into a logarithmic number of state variables, depending on the structure of these formulas the symbolic representation can be exponentially smaller than the semi-symbolic representation.

C. DFA Canonization
For every NFA, there exists a unique smallest DFA that recognizes the same language, called the canonical or minimal DFA. A typical way to construct the canonical DFA for a given NFA is to determinize the automaton using the subset construction [21] and then minimize it using, for example, Hopcroft's DFA minimization algorithm [9]. The idea of Hopcroft's algorithm is as follows. Given a state s in automaton N, we define L(s) as the set of words accepted by N when s is taken as the initial state. Note that a minimal DFA cannot have two different states s and s′ such that L(s) = L(s′). Hopcroft's algorithm computes equivalence classes of states, such that two states s and s′ are considered equivalent if L(s) = L(s′).

Theorem 2 (Hopcroft). Let N be an NFA. Then D = [equivalence ∘ determinize](N) is the minimal DFA accepting the same language as N.

Function equivalence(N) partitions the set of states into equivalence classes, which then form the states of the canonical DFA. The initial partition consists of Acc and S \ Acc, which are clearly not equivalent. At each iteration this partition is refined by splitting equivalence classes, until no longer possible.
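The refinement loop can be sketched as follows. For simplicity this is the Moore-style variant, which re-splits every block against the current partition each round; Hopcroft's algorithm obtains its O(n log n) bound with a cleverer worklist, but the fixpoint it reaches is the same. The explicit-alphabet encoding is an illustrative assumption.

```python
def minimize(states, alphabet, delta, accepting):
    """Partition-refinement minimization of a complete DFA.
    delta: dict mapping (state, letter) -> state. Returns the final
    partition as a list of frozensets (the minimal-DFA states)."""
    # Initial partition: accepting vs. non-accepting states.
    partition = [b for b in (set(accepting), set(states) - set(accepting)) if b]
    while True:
        def block_of(s):
            return next(i for i, b in enumerate(partition) if s in b)

        def signature(s):
            # Which block each letter leads to, w.r.t. the current partition.
            return tuple(block_of(delta[(s, a)]) for a in alphabet)

        new_partition = []
        for block in partition:
            groups = {}
            for s in block:
                groups.setdefault(signature(s), set()).add(s)
            new_partition.extend(groups.values())  # split by signature
        if len(new_partition) == len(partition):   # no block was split
            return [frozenset(b) for b in new_partition]
        partition = new_partition
```

For the three-state DFA over {a} with transitions 0→1→2→2 and accepting states {1, 2}, states 1 and 2 have identical signatures and collapse into one block, giving a two-state minimal DFA.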
MONA, the state-of-the-art practical tool for constructing minimal DFAs from Monadic Second-Order Logic formulas [14], [22], operates by induction on the structure of the input formula, constructing DFAs for the subformulas and then combining them recursively, while using Hopcroft's algorithm to minimize after each step. We thus say that MONA follows the Hopcroft approach for minimization, though with an optimized variant adapted to the semi-symbolic automata used by MONA. For more details, we refer to [14].

A less direct way to construct canonical DFAs, in which there is no explicit minimization step, is due to Brzozowski [10]. We use the following formulation of Brzozowski's approach [8]. For notation: reverse(N) is the function that maps the NFA N = (P, S, S_0, η, Acc) to the NFA N^R = (P, S, Acc, η^R, S_0), where (s′, σ, s) ∈ η^R iff (s, σ, s′) ∈ η; determinize(N) is the DFA obtained by applying the subset construction to N; and reachable(N) is the automaton resulting from removing all states that are not reachable from the initial states of N.

Theorem 3 (Brzozowski). Let N be an NFA. Then N′ = [reachable ∘ determinize ∘ reverse]^2(N) is the minimal DFA accepting the same language as N.

In short, Theorem 3 tells us that by applying to an NFA two rounds of reversal, determinization, and pruning of unreachable states, in that order, the resulting DFA is canonical.
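Theorem 3 can be sketched directly over explicit NFAs. In this sketch, transition labels are explicit letters rather than propositional formulas, and the subset construction only generates subset-states reachable from the start state, so reachable is folded into determinize. This is an illustrative sketch, not an implementation used in the evaluation.

```python
def reverse(nfa):
    """nfa: (states, alphabet, transitions, initials, accepting), with
    transitions a set of (src, letter, dst). Swap edge direction and
    exchange the roles of initial and accepting states."""
    states, alphabet, trans, init, acc = nfa
    rev = {(d, a, s) for (s, a, d) in trans}
    return (states, alphabet, rev, set(acc), set(init))

def determinize_reachable(nfa):
    """Subset construction that only generates reachable subset-states,
    combining the determinize and reachable operations."""
    states, alphabet, trans, init, acc = nfa
    start = frozenset(init)
    seen, work, dtrans = {start}, [start], set()
    while work:
        S = work.pop()
        for a in alphabet:
            T = frozenset(d for (s, x, d) in trans if s in S and x == a)
            dtrans.add((S, a, T))
            if T not in seen:
                seen.add(T)
                work.append(T)
    dacc = {S for S in seen if S & set(acc)}
    return (seen, alphabet, dtrans, {start}, dacc)

def brzozowski(nfa):
    # Two rounds of reverse + determinize + prune (Theorem 3).
    return determinize_reachable(reverse(determinize_reachable(reverse(nfa))))
```

For example, the two-state NFA for "words over {a, b} ending in a" yields, after both rounds, the canonical two-state DFA for that language.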
D. Symbolic LTLf Synthesis

The problem of LTLf synthesis is defined as follows:

Definition 3 (LTLf Synthesis). Let φ be an LTLf formula over P, and let X, Y be two disjoint sets of propositions such that X ∪ Y = P. X is the set of input variables and Y is the set of output variables. φ is realizable with respect to ⟨X, Y⟩ if there exists a strategy g : (2^X)^* → 2^Y such that, for an arbitrary infinite sequence π = X_0, X_1, ... ∈ (2^X)^ω of propositional assignments over X, we can find k ≥ 0 such that φ is true in the finite trace ρ = (X_0 ∪ g(ε)), (X_1 ∪ g(X_0)), ..., (X_k ∪ g(X_0, X_1, ..., X_{k−1})).

Intuitively, LTLf synthesis can be thought of as a game between two players: the environment, which controls the input variables, and the system, which controls the output variables. Solving the synthesis problem means synthesizing a strategy g for the system such that, no matter how the environment behaves, the combined behavior trace of both players satisfies the logical specification φ [12]. There are two versions of the synthesis problem, depending on which player acts first. Here we consider the system as the first player, but a version where the environment moves first can be obtained with small modifications. In both versions, however, the system decides at which round to end the game.

The state-of-the-art approach to LTLf synthesis is the symbolic approach proposed in [22]. This approach first translates the LTLf specification to a first-order formula, which is then fed to MONA to get the fully-minimized semi-symbolic DFA. This DFA is transformed into a fully-symbolic DFA, using BDDs [20] to represent each δ_z as well as the formula f for the accepting states. Solving a reachability game over this symbolic DFA settles the original synthesis problem.
The game is solved by performing a least-fixpoint computation over two Boolean formulas: w over Z and t over Z ∪ Y, which represent the set of all winning states and the set of all pairs of winning states with winning outputs, respectively. Winning states and winning outputs are defined below.
Definition 4 (Winning State and Winning Output). Given a symbolic DFA D = (X ∪ Y, Z, I, δ, f), an assignment Z ∈ 2^Z is a winning state if Z is an accepting state, that is, Z |= f, or there exists Y ∈ 2^Y such that, for every X ∈ 2^X, δ(X, Y, Z) is a winning state. Such a Y is a winning output of winning state Z.

t and w are initialized as t_0(Z, Y) = f(Z) and w_0(Z) = f(Z), since every accepting state is a winning state. Then t_{i+1} and w_{i+1} are constructed as follows:
• t_{i+1}(Z, Y) = t_i(Z, Y) ∨ (¬w_i(Z) ∧ ∀X. w_i(δ(X, Y, Z)))
• w_{i+1}(Z) = ∃Y. t_{i+1}(Z, Y)

The computation reaches a fixpoint when w_{i+1} ≡ w_i. At this point, no more states will be added, and so all winning states have been found. By evaluating w_i on the initial assignment I we can determine whether there exists a winning strategy. If that is the case, t_i can be used to compute a winning strategy. This can be done through the mechanism of Boolean synthesis [23]. For more details, we refer to [22]. We note that extensions of LTLf synthesis were studied in [13] and [24]. In all of these works, constructing DFAs corresponding to LTLf formulas proved to be the computational bottleneck.

III. BRZOZOWSKI'S ALGORITHM FROM
LTLf

As mentioned in Section II-C, a variation of Hopcroft's algorithm is already implemented in the tool MONA, which is the standard tool employed in LTLf synthesis applications. In contrast, there is no existing tool that directly implements Brzozowski's algorithm for temporal specifications. In this section, we describe how the algorithm can be adapted to construct a minimal DFA from an LTLf formula, presenting both an explicit and a symbolic version of the algorithm. This section focuses on the theory of the algorithm; implementation details can be found in Section IV-A.

Theorem 3 describes how to obtain a minimal DFA from an NFA, but in the context of LTLf synthesis we would like to start instead from an LTLf formula. This leads to the following sequence of operations:

1) Reverse DFA construction: Construct a DFA that recognizes the reverse of the language of φ. This corresponds to the first round of reachable ∘ determinize ∘ reverse.
2) Reversal into a co-DFA: Reverse again the DFA for the reverse language into a codeterministic finite automaton (co-DFA) for the original language. This corresponds to the reverse operation in the second round.
3) Determinization and pruning: The last two steps, corresponding to the determinize and reachable operations, can be performed either explicitly or symbolically.
   a) Explicit: Apply the subset construction to the co-DFA to obtain an explicit DFA, removing states that are not reachable from the initial states.
   b) Symbolic: Convert the explicit co-DFA into a symbolic DFA (defined in Section II-B). Next, compute a symbolic representation of the set of reachable states of the symbolic DFA. Since removing states cannot be easily done in the symbolic representation, the symbolic set of reachable states is instead used later to prune the search during the game-solving step.

We now describe each of these steps in more detail. Proofs for the theorems can be found in the Appendix.
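Step 1 begins with a purely syntactic reversal of the formula, replacing each future connective with its past counterpart (X becomes Y, U becomes S). This can be sketched as below, with formulas as nested tuples; the representation and helper name are illustrative assumptions, since the actual construction in this paper goes through MONA via a first-order encoding.

```python
# Syntactic formula reversal for step 1: swap each future connective
# for its past counterpart. Boolean connectives pass through unchanged.

SWAP = {"X": "Y", "U": "S"}

def reverse_formula(phi):
    """phi: LTLf formula as nested tuples, e.g.
    ('U', ('atom', 'p'), ('atom', 'q')) for p U q."""
    if phi[0] == "atom":
        return phi
    op, *args = phi
    return (SWAP.get(op, op),) + tuple(reverse_formula(a) for a in args)
```

For instance, p U (X q) becomes p S (Y q), whose minimal DFA (over reversed traces) is then built and fed into the remaining reversal, determinization, and pruning steps.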
A. Reverse DFA Construction
Starting from an LTLf formula φ, we first produce a minimal DFA for the reverse language of φ. Note that, being minimal, this DFA has no unreachable states. Therefore, it corresponds to a DFA obtained by applying the first round of operations reachable ∘ determinize ∘ reverse. Although minimality is a stronger condition than reachability, having the DFA be minimal improves the performance of later steps.

To construct such a DFA, we can use a technique introduced in [19]. In the first step, we convert the LTLf formula φ into a PastLTL formula φ^R for the reverse language. This can be easily done by simply replacing every temporal connective in φ with its corresponding past connective. Since PastLTL can be translated to first-order logic [19], φ^R can be converted into a DFA and minimized, for example using Hopcroft's algorithm.

It might seem odd to generate a minimal DFA for φ^R as an intermediate step in a minimization algorithm, when one could simply generate the minimal DFA for the original formula φ directly. The difference, however, is that the minimal DFA for the reverse language of an LTLf formula is guaranteed to have size at most exponential in the size of the formula [11], [15], while the DFA for the formula itself can be doubly exponential [6]. Specifically, it is shown in [11] how to convert an LTLf formula to an alternating word automaton with linear state blow-up, and it is shown in [15] how to convert an alternating word automaton to a DFA for the reverse language with exponential state blow-up. Therefore, constructing the DFA for the reverse language is, in theory, exponentially more efficient than the direct construction. A more detailed discussion can be found in the Appendix.

Theorem 4.
Let φ be an LTLf formula and A be the minimal DFA for φ^R. Then A′ = [reachable ∘ determinize ∘ reverse](A) is the minimal DFA accepting the same language as φ.

Theorem 4 states that the construction of the minimal DFA for the reverse language corresponds to the first round of reachable ∘ determinize ∘ reverse in Theorem 3. In the remainder of this section we describe how to perform the second round of operations.

B. Reversal into a Co-DFA
The first step produces a minimal DFA A for the reverse language of φ. Reversing this DFA can be done easily and in linear time, by swapping initial states with final states and swapping source and destination for every transition. The result is known as a codeterministic finite automaton (co-DFA) [25] for the original language of the formula. Just as a DFA is a special case of an NFA, namely one with only a single transition out of a state for each assignment, a co-DFA is a special case of an NFA in which there is only a single transition into a state for each assignment. That is, an NFA with transition function η is a co-DFA if, for every s ∈ η(s′, σ) and s ∈ η(s′′, σ), it is the case that s′ = s′′.

More formally, let A = (P, S, S_0, H, Acc) be the semi-symbolic DFA for the reverse language, with the (deterministic) transition relation given as H ⊆ S × Λ × S, where Λ is a set of propositional formulas, as described in Section II-B. This representation of the transition relation is easy to obtain from the output of MONA. Reversing A produces the co-DFA C = (P, S, Acc, H^R, S_0), where H^R = {(s′, λ, s) | (s, λ, s′) ∈ H}.

C. Explicit Minimal DFA Construction
The standard way of determinizing an NFA is via the subset construction. This construction can be performed in the semi-symbolic representation, with explicit states and symbolic transitions. In this case, each state in the resulting DFA represents a subset of the states in the NFA, and a transition between two states representing subsets S_1 and S_2 is labeled by the disjunction of all labels λ such that (s_1, λ, s_2) ∈ H for s_1 ∈ S_1 and s_2 ∈ S_2. In this semi-symbolic representation, finding the reachable states can be performed by a simple graph search on the graph of the automaton.

The problem with this explicit-state approach is that the subset construction causes an exponential blowup in the state space. This blowup can nullify the advantage obtained by constructing an exponential DFA for φ^R rather than a doubly-exponential DFA for φ. In the next section we describe how this problem may be mitigated by instead directly constructing a fully-symbolic representation of the DFA.

D. Symbolic Minimal DFA Construction
As described in Section II-D, the state-of-the-art approach for solving LTLf synthesis uses a fully-symbolic representation of the DFA, which as noted in Section II-B can be exponentially smaller. Therefore, the exponential blowup caused by the explicit subset construction described in Section III-C above might be canceled out when the DFA is made symbolic. Constructing the explicit DFA, then, seems like wasted effort that could be avoided by directly obtaining a symbolic DFA from the explicit co-DFA. With that in mind, in this section we describe an alternative approach that performs the subset construction and the pruning of unreachable states symbolically.
1) Symbolic Subset Construction:
The intuition behind the symbolic determinization procedure is that, after the subset construction, each state in the DFA corresponds to a set of NFA states. Each DFA state can therefore be represented by an assignment to a set of Boolean variables, one for each NFA state, where a variable is set to true if the corresponding state is in the set. This corresponds naturally to a symbolic representation D in which each explicit state of the NFA is a state variable of D. Therefore, the set of NFA states S is overloaded as the set of state variables of D. Moreover, in addition to denoting a set of NFA states, S here is also used to denote a DFA state. In this way, 2^S is able to encode the entire state space of D.

Recall that the transition function δ : 2^S × 2^P → 2^S of D can be represented as an indexed family {δ_s : 2^S × 2^P → {0, 1} | s ∈ S}. Intuitively speaking, given current DFA state S ∈ 2^S and transition condition σ ∈ 2^P, δ_s indicates whether state variable s is assigned true or false in the successor state. Variable s is assigned true if there is a transition in the NFA from a state in S that leads to s under transition condition σ, and false otherwise.

Therefore, given an NFA N = (P, S, S_0, H, Acc), the symbolic determinization producing the symbolic DFA D = (P, S, I, δ, f) proceeds as follows:
• S is the set of state variables;
• I ∈ 2^S is such that I(s) = 1 if and only if s ∈ S_0;
• f = ⋁_{s ∈ Acc} s;
• δ_s : 2^S × 2^P → {0, 1} is such that δ_s(S, σ) = 1 iff (t, λ, s) ∈ H for some t such that S(t) = 1 and some λ such that σ |= λ.

Each δ_s can be represented by a formula (or BDD) δ_s = ⋁_{(t,λ,s) ∈ H} (t ∧ λ), with t interpreted as a state variable.

In order to show that the symbolic determinization described above is correct, i.e., that L(D) = L(N), we need to prove that the state where the DFA D is after reading a trace ρ corresponds exactly to the set of states where the NFA N can be after reading ρ, which follows the standard subset construction.
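The construction above can be sketched as follows, with each label λ modeled as a Python predicate over assignments (a stand-in for a BDD) and a DFA state represented as the frozenset of state variables currently assigned true. The helper names and the example automaton are illustrative assumptions.

```python
# Symbolic subset construction sketch: each NFA state becomes a state
# variable; delta assigns variable s true iff some edge (t, lam, s)
# in H fires, i.e. t is true in the current state and sigma |= lam.

def make_symbolic_dfa(H, initials, accepting):
    """H: list of labeled NFA edges (t, lam, s)."""
    def delta(S, sigma):
        # delta_s = OR over edges (t, lam, s) of: (t in S) AND lam(sigma)
        return frozenset(s for (t, lam, s) in H if t in S and lam(sigma))
    I = frozenset(initials)                   # variables true initially
    f = lambda S: bool(S & set(accepting))    # f = OR of accepting variables
    return I, delta, f

# NFA for "p was true at some instant": variable 1 turns on once p is seen.
H = [(0, lambda a: True,  0),
     (0, lambda a: a["p"], 1),
     (1, lambda a: True,  1)]
```

Note that delta never explores the subset space up front: successor states are computed on demand from the current variable assignment, which is exactly what makes the fully-symbolic game-solving step possible without materializing the exponential subset automaton.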
In the following, we use δ(S, ρ) to denote the DFA state that is reached from S by reading ρ. Likewise, H(S, ρ) denotes the set S′ ⊆ S of all NFA states that can be reached from the states s ∈ S by reading ρ.

Lemma 1.
Let ρ ∈ (2^P)^* be a finite trace. The DFA state δ(I, ρ) encodes the set of NFA states H(S_0, ρ); that is, {s | δ(I, ρ)(s) = 1} = H(S_0, ρ).

The following theorem then follows directly from Lemma 1.
Theorem 5. D is equivalent to N; that is, L(D) = L(N).

The following theorem states the computational complexity of the symbolic determinization described above. The limiting factor is the construction of the formulas δ_s, each of which takes linear time. Therefore, the total complexity is quadratic.

Theorem 6.
The symbolic determinization can be done in quadratic time in the size of the NFA.

2) Symbolic State-Space Pruning:
The symbolic subset construction described in the previous subsection allows us to obtain a DFA for φ directly in symbolic representation. Although this representation is more compact, it presents further challenges for the final step of pruning unreachable states. This is because in the symbolic DFA the state space is fixed by the set of state variables. Since states are not represented explicitly, but rather implicitly by assignments over these variables, there is no easy way to remove states. The alternative that we propose is to instead compute a symbolic representation of the set of reachable states, which can then be used during game-solving to restrict the search for a winning strategy. This means that the state space of the automaton, implicitly represented by the state variables, is not minimized, but during game-solving the additional, unreachable states are ignored.

We denote by r(Z) the Boolean formula, over the set of state variables Z, that is satisfied by an assignment Z ∈ 2^Z iff Z encodes a state that is reachable from the initial state. We compute r(Z) for the symbolic DFA D = (P, Z, I, δ, f) by iterating the following recurrence until a fixpoint:

r_0(Z) = I(Z)
r_{i+1}(Z) = r_i(Z) ∨ ∃Z′. ∃X. ∃Y. r_i(Z′) ∧ (δ(Z′, X ∪ Y) = Z)

Once a fixpoint is reached, the resulting formula r(Z) denotes the set of reachable states of the automaton. Then, during the computation of the winning states as described in Section II-D, we restrict t_i after each step to only those values of Z that correspond to reachable states.

IV. IMPLEMENTATION AND EVALUATION
In order to better compare the performance of Hopcroft's and Brzozowski's minimization algorithms in the context of temporal synthesis, we conducted extensive experiments over different classes of benchmarks. In our evaluation, Hopcroft's algorithm is represented by MONA [14]. MONA constructs a minimal DFA by first generating DFAs for subformulas, then combining them recursively while applying Hopcroft's algorithm on the intermediate DFAs. For more implementation details of MONA, we refer to [14]. In this section, we first present details of our implementation of Brzozowski's algorithm, and then show an experimental comparison between the two minimization algorithms.
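As a point of reference, the partition-refinement core of Hopcroft's algorithm can be sketched as follows. This is a simplified explicit-state illustration with names of our own choosing, not MONA's BDD-based implementation:

```python
from collections import defaultdict

def hopcroft_minimize(states, alphabet, delta, accepting):
    """Partition-refinement core of Hopcroft's algorithm (simplified sketch).
    delta maps (state, letter) -> state for a complete DFA. Returns the
    partition of `states` into equivalence classes, as a set of frozensets."""
    inv = defaultdict(set)              # inverse transition relation
    for (q, a), r in delta.items():
        inv[(r, a)].add(q)

    final = frozenset(accepting)
    rest = frozenset(states) - final
    partition = {p for p in (final, rest) if p}
    # Seed the worklist with the smaller initial block, for every letter.
    work = {(min(partition, key=len), a) for a in alphabet}

    while work:
        splitter, a = work.pop()
        # States with an a-transition into the splitter.
        pre = {p for q in splitter for p in inv[(q, a)]}
        for block in list(partition):
            inter, diff = block & pre, block - pre
            if inter and diff:          # the splitter splits this block
                partition.remove(block)
                partition |= {inter, diff}
                for b in alphabet:
                    if (block, b) in work:
                        work.remove((block, b))
                        work |= {(inter, b), (diff, b)}
                    else:               # only the smaller half is needed
                        work.add((min(inter, diff, key=len), b))
    return partition
```

The "process only the smaller half" rule in the else-branch is what yields the O(n log n) bound; dropping it degenerates the procedure into Moore's quadratic refinement.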
A. Implementation
As described in Section III, Brzozowski's algorithm consists of three steps: 1) reverse-DFA construction, 2) reversal into a co-DFA, and 3) determinization and pruning. To perform the first step, we translate the PastLTL formula φ^R into a first-order formula fol(φ^R) following the translation in [19] and then use MONA to construct the minimal DFA for φ^R. Reversing this DFA into a co-DFA for φ is straightforward. Instead of implementing the reversal step as a separate operation, we optimize by performing the reversal while determinizing. There are two different versions of the determinization-and-pruning step: explicit and symbolic. We now elaborate on them. Note that each version starts with the DFA of φ^R, and we combine the reversal with the subsequent operations of subset construction followed by state-space pruning. Proofs for the theorems can be found in the Appendix.
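The overall scheme can be illustrated over explicit automata as follows. This is an illustrative sketch of Brzozowski's construction (names are ours), not the SPOT- or BDD-based implementations described in this section; determinization generates only reachable subset-states, so pruning is built in, and two rounds of reverse-then-determinize applied to an arbitrary NFA yield the minimal DFA:

```python
def reverse(states, delta, initials, accepting):
    """Flip every transition and swap the initial and accepting state sets.
    delta is a set of (source, letter, target) NFA transitions."""
    return states, {(t, a, s) for (s, a, t) in delta}, set(accepting), set(initials)

def determinize(alphabet, delta, initials, accepting):
    """Subset construction, generating only subset-states reachable from
    the initial subset (so unreachable-state pruning is implicit)."""
    start = frozenset(initials)
    states, ddelta, frontier = {start}, set(), [start]
    while frontier:
        S = frontier.pop()
        for a in alphabet:
            T = frozenset(t for (s, b, t) in delta if b == a and s in S)
            ddelta.add((S, a, T))
            if T not in states:
                states.add(T)
                frontier.append(T)
    accept = {S for S in states if S & set(accepting)}
    return states, ddelta, {start}, accept

def brzozowski_minimize(alphabet, states, delta, initials, accepting):
    """reachable . determinize . reverse, applied twice, yields the minimal DFA."""
    for _ in range(2):
        states, delta, initials, accepting = reverse(states, delta, initials, accepting)
        states, delta, initials, accepting = determinize(alphabet, delta, initials, accepting)
    return states, delta, initials, accepting
```

In our setting the starting point is already a minimal DFA for the reverse language, so only the second round is actually performed, which is exactly the combined reversal-plus-determinization optimization described above.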
1) Explicit Minimal DFA Construction:
Inspired by [26], we borrow the rich APIs from SPOT [16] to perform each computation. In particular, SPOT adopts a semi-symbolic representation for automata, where states are explicit and transitions are symbolic, and therefore we use it to implement the explicit approach. Note that SPOT is a platform for manipulating ω-automata (automata over infinite words). Therefore we represent the co-DFA as a weak Büchi automaton (wBA) [27].

Since the reversal step is straightforward, to simplify the description we consider the co-DFA as the starting point, to better show the implementation details. The transformation to wBA follows the techniques presented in [26], where we introduce a fresh proposition alive ∉ P. Intuitively, if a co-DFA accepts language L(C), then its wBA accepts the infinite words in {(ρ ∧ alive) · (¬alive)^ω | ρ ∈ L(C)}, where ρ ∧ alive denotes that alive holds at each timepoint of the finite trace ρ. Given co-DFA C = (P, S, S_0, H, Acc), we construct the wBA as follows:
1) introduce an extra state sink;
2) for each accepting state s in Acc, add a transition from s to sink, with transition condition ¬alive;
3) for each transition between states in C, change the transition condition λ to λ ∧ alive;
4) add a self-loop on state sink with condition ¬alive;
5) assign sink as the unique accepting state.

Theorem 7.
Let C be a co-DFA, and B the wBA generated by construction (1)-(5) above. Then L(B) = {(ρ ∧ alive) · (¬alive)^ω | ρ ∈ L(C)}.

Then we are able to use the SPOT APIs for wBA to perform subset construction and unreachable-state pruning, thus obtaining the wDBA DB. The corresponding API functions are tgba_powerset() and purge_unreachable_states(), respectively. Finally, we convert the wDBA DB back to a DFA as follows:
a) remove the unique accepting state {sink} and the transitions leading to or coming from {sink};
b) eliminate alive on each transition by assigning it to true;
c) assign all states that move to {sink} with transition condition ¬alive as the accepting states of the DFA.

Theorem 8.
Let C be a co-DFA, B the corresponding wBA, DB the wDBA from SPOT, and A the DFA obtained from construction (a)-(c) above. Then A is minimal.

2) Symbolic Minimal DFA Construction: Instead of keeping each DFA state explicit, the symbolic approach described in Section III-D directly constructs a symbolic DFA as in Definition 2. Here, we follow the representation technique used by MONA and [22], where the propositional formulas for the transition function and accepting states are represented by Binary Decision Diagrams (BDDs). Moreover, the reversal step is again combined with the symbolic subset construction into one step. Thus, for each state variable s, in order to construct the BDD for the formula δ_s : 2^S × 2^P → {0, 1}, we have to take care of switching the current and successor states of a given transition of the reverse DFA. The same holds for exchanging the BDDs of the initial and accepting states.

As for the state-space pruning, formulas represented as BDDs allow us to perform the usual Boolean operations, such as conjunction, disjunction and quantifier elimination. After obtaining the set of reachable states r(Z), during the computation of the winning states as described in Section II-D, we need to restrict t_i after each step to only those values of Z that correspond to reachable states. There are two ways in which we can do this. The most obvious way is to simply take the conjunction of t_i(Z, Y) with r(Z), thus removing all assignments that correspond to unreachable states. The second option, since we are using BDDs to represent the Boolean formulas t_i and r, is to apply the standard BDD Restrict operation [28]. The BDD produced by Restrict(t_i, r) still returns 1 for all satisfying assignments of t_i that also satisfy r. Those satisfying assignments that do not satisfy r, however, are selectively mapped to either 0 or 1, using heuristics to try to make a choice that will lead to a smaller BDD. This essentially corresponds to choosing to keep a subset of the unreachable states when that leads to a smaller symbolic representation of the set of winning states.

Note that the predecessor computation used to compute t_{i+1} may add unreachable states; therefore it is necessary to apply the conjunction or restriction operation at every iteration, rather than just once.

B. Experimental Evaluation
We compared the performance of Hopcroft's and Brzozowski's minimization algorithms on a set of benchmarks taken from the literature. Since there are two ways of performing Brzozowski's algorithm, as presented in Section III, we have in total three different approaches, namely Hopcroft, Explicit Brzozowski and Symbolic Brzozowski. The benchmarks consist of two classes. The first class is the Random family, composed of 1000 LTLf formulas formed by random conjunction, generated as described in [13]. The second class of benchmarks is from [29] and describes two-player games, split into the Single-Counter, Double-Counters and
Nim benchmark families. Since the formulation of LTLf synthesis used in [29] assumes that the environment acts first, the LTLf formulas had to be modified slightly to adapt to our formulation, where the system acts first.

The results shown in this section represent the end-to-end execution of the synthesis algorithms, from an LTLf specification to a winning strategy. Therefore, they include both the time to construct the automaton and the time to solve the game. All tests were run on a computer cluster with exclusive access to a node with Intel(R) Xeon(R) CPU E5-2650 v2 processors running at 2.60GHz. The timeout was set to 1000 seconds.

Note that, as introduced in Section IV-A2, during symbolic state-space pruning we are able to apply either restriction or conjunction to access only reachable states during game-solving. We show only the results using restriction, as the difference is not significant and in most cases restriction gives slightly better results.

Figure 1 shows a cactus plot comparing the three different approaches on the Random benchmarks. The curves show how many instances can be solved within a given time limit. The further a curve is to the right of the graph, the more benchmarks could be solved in less time. The graph shows that Hopcroft's minimization algorithm is in fact able to solve many more cases than both versions of Brzozowski's algorithm. Furthermore, the explicit Brzozowski approach slightly outperforms the symbolic one. Although with a time limit of 10 seconds the symbolic version is able to handle more cases than the explicit one, if we take 1000 seconds as the time limit the explicit version is able to handle in total more cases than the symbolic one. This observation shows that the explicit version is more scalable than the symbolic one.

Fig. 1: Total running time with Hopcroft's or Brzozowski's minimization algorithms on Random benchmarks.

Fig. 2: Total running time with Hopcroft's or Brzozowski's minimization algorithms on Single-Counter, Double-Counters and Nim benchmarks.

The results are not much different in the case of the non-random benchmarks, shown in Figure 2. There we can see that symbolic Brzozowski's algorithm timed out for the vast majority of instances, only being able to solve the smaller instances of the
Nim family, and none of the instances of the Single-Counter and Double-Counters families. The explicit version performs slightly better, being able to handle some smaller instances of the Single-Counter and Double-Counters families. Hopcroft's algorithm, on the other hand, can solve a large number of instances within the timeout.

The results shown above allow us to answer the question of which minimization algorithm is more efficient in the context of temporal synthesis. Our data points to Hopcroft's algorithm as the better choice. It might seem surprising that Hopcroft's algorithm outperforms Brzozowski's so significantly. The symbolic version in particular was expected to benefit from the fact that it avoids ever having to construct the explicit DFA for the LTLf formula, instead constructing only the DFA for the reverse language, which should be exponentially smaller. Yet, it fails to compete even against the explicit version. Understanding the failure of Brzozowski's algorithm in the context of temporal synthesis requires a more in-depth investigation of the internals of the minimization procedure. We perform this analysis in the next section.

V. ANALYSIS AND DISCUSSION
To understand the reasons for the results in Section IV, we compared the size of the minimal DFA for the formula with the size of the DFA for the reverse language obtained in the first half of the Brzozowski construction. We found that the results of this comparison did not match the expectation outlined in Section III-A that the reverse DFA should be exponentially smaller. We summarize here the results of this comparison. Details and plots can be found in the Appendix.

For the non-random benchmarks, in most cases the reverse DFA is actually larger than the minimal DFA for the formula. The only exceptions are some of the smaller benchmarks in the Nim family. The counter benchmarks in particular allow us to measure the asymptotic growth of each automaton as the number of bits n of the counters increases, and we observe that both DFAs scale exponentially in n. This is expected for the reverse DFA, but contradicts the expectation that the DFA should be doubly exponential. For the 1000 random formulas, the reverse DFA is smaller in many cases, as expected, but there is a significant number of cases where the two automata have approximately the same number of states. Even in those cases where the reverse DFA is smaller, it is not exponentially smaller. Recall that the reverse DFA will be reversed again and determinized by subset construction, which can cause an exponential blowup. Therefore, a reverse DFA that is not exponentially smaller might not compensate for the blowup caused by determinization.

The symbolic Brzozowski approach is particularly affected by the size of the reverse DFA, since in this approach the number of state variables is linear in the number of states of the reverse DFA, instead of logarithmic in the number of states of the minimal DFA. Therefore, if the reverse DFA is not exponentially smaller, the symbolic representation is likely to be larger than the one produced by the explicit or Hopcroft constructions. Although the symbolic state-space pruning during game-solving does significantly reduce the BDD sizes at each iteration, the computation of the set of reachable states itself ends up consuming a majority of the running time. Therefore, it turns out not to be helpful in getting the symbolic version of Brzozowski's algorithm to scale.

Therefore, although Brzozowski's algorithm had the potential to take advantage of the smaller size of the reverse DFA, our results suggest that the worst-case exponential gap between DFA and reverse DFA might not be common enough to make this approach worthwhile. In fact, in many cases the reverse DFA turns out to be of similar size or even larger.
This is consistent with previous results from [7], that canonical DFA constructed from temporal formulas are often orders of magnitude smaller in practice than the corresponding NFA. Our conclusion, therefore, is that for the purpose of synthesis, Hopcroft's algorithm seems to be a better option for constructing the minimal DFA than Brzozowski's algorithm.

REFERENCES

[1] R. Bloem, S. J. Galler, B. Jobstmann, N. Piterman, A. Pnueli, and M. Weiglhofer, "Specify, Compile, Run: Hardware from PSL," Electron. Notes Theor. Comput. Sci., vol. 190, no. 4, pp. 3–16, 2007.
[2] A. Pnueli and R. Rosner, "On the Synthesis of a Reactive Module," in POPL, 1989.
[3] R. Bloem, "Reactive Synthesis," in Proc. 15th Conf. on Formal Methods in Computer-Aided Design. IEEE, 2015, p. 3.
[4] S. Fogarty, O. Kupferman, M. Y. Vardi, and T. Wilke, "Profile Trees for Büchi Word Automata, with Application to Determinization," in Proceedings Fourth International Symposium on Games, Automata, Logics and Formal Verification, GandALF 2013, 2013, pp. 107–121.
[5] S. Zhu, L. M. Tabajara, J. Li, G. Pu, and M. Y. Vardi, "A Symbolic Approach to Safety LTL Synthesis," in HVC, 2017, pp. 147–162.
[6] O. Kupferman and M. Y. Vardi, "Model Checking of Safety Properties," Formal Methods in System Design, vol. 19, no. 3, pp. 291–314, 2001.
[7] D. Tabakov, K. Y. Rozier, and M. Y. Vardi, "Optimized temporal monitors for SystemC," Formal Methods in System Design, vol. 41, no. 3, pp. 236–268, 2012.
[8] D. Tabakov and M. Y. Vardi, "Experimental Evaluation of Classical Automata Constructions," in LPAR, 2005, pp. 396–411.
[9] J. E. Hopcroft, "An n Log n Algorithm for Minimizing States in a Finite Automaton," Stanford, CA, USA, Tech. Rep., 1971.
[10] J. A. Brzozowski, "Canonical Regular Expressions and Minimal State Graphs for Definite Events," 1962.
[11] G. De Giacomo and M. Y. Vardi, "Linear Temporal Logic and Linear Dynamic Logic on Finite Traces," in IJCAI, 2013.
[12] ——, "Synthesis for LTL and LDL on Finite Traces," in IJCAI, 2015.
[13] S. Zhu, L. M. Tabajara, J. Li, G. Pu, and M. Y. Vardi, "A Symbolic Approach to Safety LTL Synthesis," in HVC, 2017, pp. 147–162.
[14] J. G. Henriksen, J. L. Jensen, M. E. Jørgensen, N. Klarlund, R. Paige, T. Rauhe, and A. Sandholm, "Mona: Monadic Second-order Logic in Practice," in TACAS, 1995.
[15] A. Chandra, D. Kozen, and L. Stockmeyer, "Alternation," J. ACM, vol. 28, no. 1, pp. 114–133, 1981.
[16] A. Duret-Lutz, A. Lewkowicz, A. Fauchille, T. Michaud, E. Renault, and L. Xu, "Spot 2.0 — A Framework for LTL and ω-automata Manipulation," in ATVA, 2016.
[17] A. Pnueli, "The temporal logic of programs," 1977, pp. 46–57.
[18] O. Lichtenstein, A. Pnueli, and L. D. Zuck, "The Glory of the Past," in Logics of Programs, 1985, pp. 196–218.
[19] S. Zhu, G. Pu, and M. Y. Vardi, "First-Order vs. Second-Order Encodings for LTLf-to-Automata Translation," in TAMC, 2019, pp. 684–705.
[20] R. E. Bryant, "Symbolic Boolean Manipulation with Ordered Binary-Decision Diagrams," ACM Comput. Surv., vol. 24, no. 3, pp. 293–318, 1992.
[21] M. Rabin and D. Scott, "Finite automata and their decision problems," IBM Journal of Research and Development, vol. 3, pp. 115–125, 1959.
[22] S. Zhu, L. M. Tabajara, J. Li, G. Pu, and M. Y. Vardi, "Symbolic LTLf Synthesis," in IJCAI, 2017, pp. 1362–1369.
[23] D. Fried, L. M. Tabajara, and M. Y. Vardi, "BDD-Based Boolean Functional Synthesis," in CAV, 2016.
[24] S. Zhu, G. De Giacomo, G. Pu, and M. Y. Vardi, "LTLf Synthesis with Fairness and Stability Assumptions," in AAAI, 2020.
[25] J. Pin, "On the language accepted by finite reversible automata," in Automata, Languages and Programming, 14th International Colloquium, ICALP87, Karlsruhe, Germany, July 13-17, 1987, Proceedings, ser. Lecture Notes in Computer Science, T. Ottmann, Ed., vol. 267. Springer, 1987, pp. 237–249.
[26] S. Bansal, Y. Li, L. M. Tabajara, and M. Y. Vardi, "Hybrid compositional reasoning for reactive synthesis from finite-horizon specifications," 2019.
[27] C. Dax, J. Eisinger, and F. Klaedtke, "Mechanizing the powerset construction for restricted classes of omega-automata," in ATVA, K. S. Namjoshi, T. Yoneda, T. Higashino, and Y. Okamura, Eds., 2007, pp. 223–236.
[28] F. Somenzi, "CUDD: CU Decision Diagram Package 3.0.0. University of Colorado at Boulder," 2016.
[29] L. M. Tabajara and M. Y. Vardi, "Partitioning Techniques in LTLf Synthesis," in IJCAI, 2019, pp. 5599–5606.

APPENDIX A

This appendix contains proofs moved from the body of the paper due to space limitations.
Brzozowski's Algorithm from LTLf

Theorem 4. Let φ be an LTLf formula and A be the minimal DFA for φ^R. Then A′ = [reachable ∘ determinize ∘ reverse](A) is the minimal DFA accepting the same language as φ.

Proof. Since A is the minimal DFA for φ^R, we have L(A) = L(φ^R). Moreover, φ^R accepts the reverse language of φ, i.e., L(φ^R) = L^R(φ), so we have L(A) = L^R(φ). As stated in Theorem 3, the first round of reachable ∘ determinize ∘ reverse returns a deterministic automaton that contains only reachable states and accepts the reverse language of the original input. Observe that A already satisfies all of these conditions: A is a minimal DFA, so it is deterministic and contains only reachable states, and L(A) = L^R(φ), so A accepts the reverse language of the original input φ. Therefore, after applying the second round of reachable ∘ determinize ∘ reverse to A, we obtain A′, the minimal DFA for φ.

Lemma 1.
Let ρ ∈ (2^P)^* be a finite trace. DFA state δ(I, ρ) encodes the set of NFA states H(S_0, ρ), that is, {s | δ(I, ρ)(s) = 1} = H(S_0, ρ).

Proof. We prove this by induction on the length of ρ.
• Base Case |ρ| = 0: Then ρ = ε. Now δ(I, ε) = I and H(S_0, ε) = S_0. Therefore, {s | I(s) = 1} = S_0 holds.
• Induction Hypothesis: Assume inductively that the statement holds for every ρ such that |ρ| < n.
• Inductive Step: If |ρ| = n, then ρ = uσ with |u| = n − 1 and σ ∈ 2^P. Let S_{n−1} = δ(I, u) and S_n = δ(I, uσ) = δ(S_{n−1}, σ). By the inductive hypothesis, H(S_0, u) = {s | S_{n−1}(s) = 1}. Consider a state s ∈ ⋃_{t ∈ H(S_0,u)} H({t}, σ): then δ_s({t}, σ) = 1 for some t ∈ H(S_0, u), where {t} also denotes the assignment in which only t is set to true. This is because δ_s = ⋁_{(t,λ,s) ∈ H} (t ∧ λ) and σ ⊨ λ for the corresponding label λ. In this case, S_n(s) = 1 holds; conversely, S_n(s) = 1 only if such t and λ exist. Therefore, S_n(s) = 1 iff s ∈ ⋃_{t ∈ H(S_0,u)} H({t}, σ), and hence H(S_0, ρ) = {s | δ(I, ρ)(s) = 1} is true.

Theorem 5. D is equivalent to N, that is, L(D) = L(N).

Proof. To show the equivalence of D and N, we prove that, given ρ ∈ (2^P)^*, ρ ∈ L(D) if and only if ρ ∈ L(N), where L(N) represents the set of words accepted by automaton N, and similarly for L(D).

An arbitrary finite trace ρ is accepted by the NFA N if and only if some state where N could be after reading ρ is an accepting state, i.e., H(S_0, ρ) ∩ Acc ≠ ∅. Denote such a state by s′, with s′ ∈ H(S_0, ρ) ∩ Acc. By Lemma 1 we have δ(I, ρ)(s′) = 1, and s′ is a disjunct of f. That is to say, δ(I, ρ) ⊨ f holds, in which case the state where D is after reading ρ is an accepting state. This leads to the conclusion that ρ is accepted by D.

Theorem 6.
The symbolic determinization can be done in quadratic time in the size of the NFA.

Proof. Consider a semi-symbolic NFA N = (P, S, S_0, H, Acc) with n states. For each state s ∈ S, the formula δ_s is constructed by enumerating all possible predecessors of s. Thus, each δ_s is computed in O(n) time, and constructing the transition function {δ_s : 2^S × 2^P → {0, 1} | s ∈ S} can be done in O(n²) time. Moreover, both I, the initial state, and f, the formula for the accepting states, can be computed in O(n) time. We then conclude that the symbolic determinization can be done in O(n²) time, where n is the number of states of the NFA.

Implementation and Evaluation
The following theorem states that, given a co-DFA C with accepted language L(C), following the construction (1)-(5) described in Section IV-A we obtain a weak Büchi automaton (wBA) B with accepted language L(B) = {(ρ ∧ alive) · (¬alive)^ω | ρ ∈ L(C)}.

Theorem 7. Let C be a co-DFA, and B the wBA generated by construction (1)-(5) above. Then L(B) = {(ρ ∧ alive) · (¬alive)^ω | ρ ∈ L(C)}.

Proof. For every finite trace ρ = ρ[0], ρ[1], ..., ρ[|ρ|−1] ∈ (2^P)^* that is accepted by C, there is an accepting run that starts from an initial state s ∈ S_0 and ends at an accepting state t ∈ Acc. According to the construction (1)-(5) described above, for the infinite trace (ρ ∧ alive) · (¬alive)^ω there exists a run of B that starts from s and visits t after reading ρ[|ρ|−1] ∧ alive. Next, this run proceeds to state sink after reading the first ¬alive and stays there forever on (¬alive)^ω. Thus (ρ ∧ alive) · (¬alive)^ω is indeed accepted by B.

Conversely, for every infinite trace (ρ ∧ alive) · (¬alive)^ω that is accepted by B, where ρ = ρ[0], ρ[1], ..., ρ[|ρ|−1] ∈ (2^P)^*, there is an accepting run that starts from an initial state s ∈ S_0, transits to the unique accepting state sink after reading the first ¬alive, and then stays there forever on (¬alive)^ω. According to the construction (1)-(5) described above, for the finite trace ρ there exists a run of C that starts from s and visits some state t after reading ρ[|ρ|−1]. Since there is an edge from t to sink on ¬alive, t is an accepting state of C. Thus ρ ∈ L(C).

The following theorem shows the correctness of the transformation (a)-(c) from wDBA to DFA described in Section IV-A.

Theorem 8.
Let C be a co-DFA, B the corresponding wBA, DB the wDBA from SPOT, and A the DFA obtained from construction (a)-(c) above. Then A is minimal.

Proof. First we prove that {sink} is the only reachable subset containing sink. Let {s_1, ..., s_k} be a reachable subset, and consider the transitions taken when alive is false. Every s_i ∈ Acc moves into sink. For every s_j ∉ Acc, every outgoing transition condition is of the form λ ∧ alive, and therefore no transition is enabled when alive is false. Therefore, the successor subset is {sink}. Since there is no way to reach sink when alive is true, no other subset containing sink is reachable. Furthermore, since sink only has a transition back to itself, so does {sink}.

Since {sink} is the only state containing sink in DB, every other state in DB corresponds to a subset of non-sink states of B, or more specifically to a subset of states of the co-DFA C. Thus, after removing {sink} and eliminating the proposition alive from DB, the automaton A that we get is exactly the same as the result of performing subset construction and removing unreachable states over C. According to Theorem 3, A is minimal.

APPENDIX
B

To understand the reasons for the results that we observed in Section IV, we have taken a closer look at the comparison between the minimal DFA for the formula and the DFA for the reverse language obtained in the first half of the Brzozowski construction. We started by measuring the number of states of the DFA and reverse DFA constructed for each of the 1000 Random formulas. Figure 3 displays a scatter plot (in log scale) comparing the two for each instance. The blue curve indicates an exponential blowup of the number of states of the reverse DFA. In addition, the gray curve represents the points where the x-axis value is equal to the y-axis value. Thus points above the gray curve represent instances where the canonical DFA is larger than the reverse DFA. In several cases, the reverse DFA is indeed smaller, as expected. We observe, however, that there is a significant number of cases where the two automata tend to have approximately the same number of states, which differs from what the theory would lead us to believe. In such cases, the benefits of using Brzozowski's construction disappear, as we can expect no advantage in constructing the minimal reverse DFA instead of directly constructing the DFA for the formula.

Note, furthermore, that even in those cases where the reverse DFA is indeed smaller, these points are far below the blue curve, indicating that they are not exponentially smaller. This is a problem because, after being reversed again, the reverse DFA will go through a subset construction, which causes an exponential blowup. This is represented in the symbolic version of Brzozowski's algorithm by the number of state variables in the symbolic representation being linear in the number of states of the reverse DFA, instead of logarithmic in the number of states of the DFA. Therefore, unless the reverse DFA is exponentially smaller, the number of state variables will be larger than in the symbolic representation of the explicitly-constructed minimal DFA. This in turn impacts the performance of the winning-strategy computation. The increase in the state space during subset construction seems to be the main reason for the failures of the Brzozowski approach, as the reverse-DFA construction itself takes time comparable to the DFA construction and succeeds in almost all 1000 benchmarks.

Fig. 3: Number of states for reverse DFA and canonical DFA on Random benchmarks. (Figures best viewed on a computer.)

Fig. 4: Number of states for reverse DFA and DFA on Nim benchmarks.

Interestingly, although the symbolic state-space pruning helps with the large state space during synthesis, significantly reducing the size of the BDDs representing the sets of winning states at each iteration, the computation of the set of reachable states itself ends up consuming a majority of the running time. Therefore, it turns out not to be helpful in getting the symbolic version of Brzozowski's algorithm to scale.

The situation for the non-random benchmarks is even more extreme. Figures 4, 5 and 6 show the comparison of DFA and reverse-DFA size for the
Nim, Single-Counter and Double-Counters benchmarks, respectively. With the exception of a few of the smaller benchmarks in the Nim family, in these instances the DFA is actually smaller than the reverse DFA.

The results for the counter benchmarks in particular are useful to better understand where our assumptions are violated, as we can observe the scalability of the automata in terms of the number of bits n in the counter, which is proportional to the formula length. The plots (in log scale) show that the reverse DFA grows exponentially with n, which is the predicted behavior. Yet, the DFA is exponential as well, rather than doubly exponential.

Fig. 5: Number of states for reverse DFA and DFA on Single-Counter benchmarks.

Fig. 6: Number of states for reverse DFA and DFA on Double-Counters benchmarks.

These results highlight an important detail that is easily overlooked in the justification for the reverse-DFA construction in Section III-A: the theoretical lower bounds on automata sizes refer to the worst case. This means that even though there are formulas for which the smallest DFA is doubly exponential, it might be that such cases occur very rarely. Our experimental results suggest that this might indeed be the case. This is consistent with previous results from [7], that canonical DFA constructed from temporal formulas are often orders of magnitude smaller in practice than the corresponding NFA. Thus, even though the worst-case size of the reverse DFA is exponentially smaller, our conclusion is that in practice the worst case is not common enough to make an approach based on constructing the reverse DFA worthwhile. Therefore, directly constructing the minimal DFA using Hopcroft's algorithm seems to be a better option for synthesis than employing Brzozowski's construction.
APPENDIX C

Previous research shows that the DFA for an LTLf formula can be doubly exponential in the size of the formula [6]; the DFA for the reverse language of the formula, however, is guaranteed to have size at most exponential [11], [15]. These observations indicate that there is a worst-case exponential gap between the two DFAs. Since the proofs in the literature are implicit, we present here a direct investigation of these bounds.

In order to show the upper and lower bounds on the DFA size, we adopt the language from [6], expressed in the following LTLf formula parameterized by m:

KV(m) = F(X true ∧ ⋀_{1≤i≤m} (p_i ↔ F(p_i ∧ N false)))
Intuitively, this formula says that the truth assignment ofthe last timepoint in the trace must appear earlier in the trace.Therefore the automaton must remember every assignmentthat it sees until the end, requiring a doubly-exponential statespace. Based on this formula we can prove the followingtheorem showing the upper and lower bounds of the DFAsize of an LTL f formula. The proof is analogous to that ofTheorem 3.3 in [6]. Theorem 9.
Let φ be an LTLf formula of length n. The size of the DFA for φ is 2^(2^O(n)) and 2^(2^Ω(n)).

Proof. Consider the formula KV(m) shown above. The upper bound on the DFA for φ follows from the linear translation of LTLf formulas to alternating automata on words (AFW) [11] and the exponential blowup to NFA, followed by another exponential blowup due to the subset construction from NFA to DFA. We show the lower bound on the DFA size following [6]. Since the standard DFA has to remember all truth assignments it sees along the trace in order to check that the last truth assignment has appeared earlier, it must be of doubly-exponential size. Thus, the smallest DFA of φ has at least 2^(2^m) states, and the length n of φ is linear in m. Therefore, we conclude that for an LTLf formula of length n, the DFA for φ has 2^(2^Ω(n)) states in the worst case.

In contrast, the DFA for the reverse language of formula φ is in fact at most exponential in the size of the original specification [11], [15], as stated in the theorem below. Theorem 10.
Let φ be an LTLf formula of length n. The size of the Reverse DFA for φ is 2^O(n) and 2^Ω(n).

Proof. Consider an LTLf formula φ of length n. We denote the corresponding PastLTL formula of φ as φ^R, the length of which is n as well. The DFA of φ^R has at most 2^O(n) states [19], and this DFA accepts the reverse of the words satisfying φ, following Theorem 1. For the lower bound, consider the LTLf formula KV(m) defined above; the smallest Reverse DFA of φ has at least 2^m states [6]. This is because the Reverse DFA just remembers the last truth assignment of the trace and checks that it appears earlier in the trace, so it is of exponential size.

Putting Theorems 9 and 10 together, it is clear that, given an LTLf formula φ, the DFA for the reverse language of φ can be, in the worst case, exponentially more succinct than the DFA for the original language of φ.

In order to better visualize this exponential gap between the Reverse DFA and the DFA, we constructed the DFA and Reverse DFA for the KV(m) formula for different values of m. We name this class of benchmarks KV. The plot in Figure 7 shows the number of states of each automaton as m grows. Note that the y-axis is in log scale.

Fig. 7: Number of states for DFA and Reverse-DFA on KV benchmarks.

It is easy to see that the DFA grows doubly-exponentially with m while the Reverse DFA grows exponentially. Already for m = 4 the DFA is almost ××