A Formalisation of Finite Automata using Hereditarily Finite Sets

Abstract

Hereditarily finite (HF) set theory provides a standard universe of sets, but with no infinite sets. Its utility is demonstrated through a formalisation of the theory of regular languages and finite automata, including the Myhill-Nerode theorem and Brzozowski's minimisation algorithm. The states of an automaton are HF sets, possibly constructed by product, sum, powerset and similar operations.

arXiv preprint [cs.FL], May

Lawrence C. Paulson

Computer Laboratory, University of Cambridge, England

The theory of finite state machines is fundamental to computer science. It has applications to lexical analysis, hardware design and regular expression pattern matching. A regular language is one accepted by a finite state machine, or equivalently, one generated by a regular expression or a type-3 grammar [6]. Researchers have been formalising this theory for nearly three decades.

A critical question is how to represent the states of a machine. Automata theory is developed using set-theoretic constructions, e.g. the product, disjoint sum or powerset of sets of states. But in a strongly-typed formalism such as higher-order logic (HOL), machines cannot be polymorphic in the type of states: statements such as "every regular language is accepted by a finite state machine" would require existential quantification over types. One might conclude that there is no good way to formalise automata in HOL [5,15].

It turns out that finite automata theory can be formalised within the theory of hereditarily finite sets: set theory with the negation of the axiom of infinity. It admits the usual constructions, including lists, functions and integers, but no infinite sets. The type of HF sets can be constructed from the natural numbers within higher-order logic. Using HF sets, we can retain the textbook definitions, without ugly numeric coding. We can expect HF sets to find many other applications when formalising theoretical computer science.

The paper introduces HF set theory and automata (Sect. 2). It presents a formalisation of deterministic finite automata and results such as the Myhill-Nerode theorem (Sect. 3). It also treats nondeterministic finite automata and results such as the powerset construction and closure under regular expression operations (Sect. 4). Next come minimal automata, their uniqueness up to isomorphism, and Brzozowski's algorithm for minimising an automaton [3] (Sect. 5). The paper concludes after discussing related work (Sect. 6–7). The proofs, which are available online [12], also demonstrate the use of Isabelle's locales [1].

Background

An hereditarily finite set can be understood inductively as a finite set of hereditarily finite sets [14]. This definition justifies the recursive definition $f(x) = \sum \{2^{f(y)} \mid y \in x\}$, yielding a bijection $f : \mathrm{HF} \to \mathbb{N}$ between the HF sets and the natural numbers. The linear ordering on HF given by $x < y \iff f(x) < f(y)$ can be shown to extend both the membership and the subset relations.

The HF sets support many standard constructions, even quotients. Equivalence classes are not available in general (they may be infinite), but the linear ordering over HF identifies a unique representative. The integers and rationals can be constructed, with their operations (but not the set of integers, obviously). Świerczkowski [14] has used HF as the basis for proving Gödel's incompleteness theorems, and I have formalised his work using Isabelle [13].

Let $\Sigma$ be a nonempty, finite alphabet of symbols. Then $\Sigma^*$ is the set of words: finite sequences of symbols. The empty word is written $\epsilon$, and the concatenation of words $u$ and $v$ is written $uv$. A deterministic finite automaton (DFA) [6,7] is a structure $(K, \Sigma, \delta, q_0, F)$ where $K$ is a finite set of states, $\delta : K \times \Sigma \to K$ is the next-state function, $q_0 \in K$ is the initial state and $F \subseteq K$ is the set of final or accepting states. The next-state function on symbols is extended to one on words, $\delta^* : K \times \Sigma^* \to K$, such that $\delta^*(q, \epsilon) = q$, $\delta^*(q, a) = \delta(q, a)$ for $a \in \Sigma$ and $\delta^*(q, uv) = \delta^*(\delta^*(q, u), v)$. The DFA accepts the string $w$ if $\delta^*(q_0, w) \in F$. A set $L \subseteq \Sigma^*$ is a regular language if $L$ is the set of strings accepted by some DFA.

A nondeterministic finite automaton (NFA) is similar, but admits multiple execution paths and accepts a string if one of them reaches a final state. Formally, an NFA is a structure $(K, \Sigma, \delta, Q_0, F)$ where $\delta : K \times \Sigma \to \mathcal{P}(K)$ is the next-state function, $Q_0 \subseteq K$ a set of initial states, the other components as above. The next-state function is extended to $\delta^* : \mathcal{P}(K) \times \Sigma^* \to \mathcal{P}(K)$ such that $\delta^*(Q, \epsilon) = Q$, $\delta^*(Q, a) = \bigcup_{q \in Q} \delta(q, a)$ for $a \in \Sigma$ and $\delta^*(Q, uv) = \delta^*(\delta^*(Q, u), v)$. An NFA accepts the string $w$ provided $\delta^*(\{q\}, w) \cap F \neq \emptyset$ for some $q \in Q_0$.

The notion of NFA can be extended with $\epsilon$-transitions, allowing "silent" transitions between states. Define the transition relation $q \xrightarrow{a} q'$ for $q' \in \delta(q, a)$. Let the $\epsilon$-transition relation $q \xrightarrow{\epsilon} q'$ be given. Then define the transition relation $q \stackrel{a}{\Rightarrow} q'$ to allow $\epsilon$-transitions before and after: $(\xrightarrow{\epsilon})^* \circ (\xrightarrow{a}) \circ (\xrightarrow{\epsilon})^*$.

Every NFA can be transformed into a DFA, where the set of states is the powerset of the NFA's states, and the next-state function captures the effect of $q \stackrel{a}{\Rightarrow} q'$ on these sets of states. Regular languages are closed under intersection and complement, therefore also under union. They are closed under repetition (Kleene star). Two key results are discussed below:

– The Myhill-Nerode theorem gives necessary and sufficient conditions for a language to be regular. It defines a canonical and minimal DFA for any given regular language. Minimal DFAs are unique up to isomorphism.
– Reorienting the arrows of the transition relation transforms a DFA into an NFA accepting the reverse of the given language. We can regain a DFA using the powerset construction. Repeating this operation yields a minimal DFA for the original language. This is Brzozowski's minimisation algorithm [3].

This work has been done using the proof assistant Isabelle/HOL. Documentation is available online at http://isabelle.in.tum.de/. The work refers to equivalence relations and equivalence classes, following the conventions established in my earlier paper [11]. If R is an equivalence relation on the set A, then A//R is the set of equivalence classes. If x ∈ A, then its equivalence class is R``{x}. Formally, it is the image of x under R: the set of all y such that (x,y) ∈ R. More generally, if X ⊆ A then R``X is the union of the equivalence classes R``{x} for x ∈ X.

When adopting HF set theory, there is the question of whether to use it for everything, or only where necessary. The set of states is finite, so it could be an HF set, and similarly for the set of final states. The alphabet could also be given by an HF set; then words (lists of symbols) would also be HF sets. Our definitions could be essentially typeless.

The approach adopted here is less radical. It makes a minimal use of HF, allowing stronger type-checking, although this does cause complications elsewhere. Standard HOL sets (which are effectively predicates) are intermixed with HF sets. An HF set has type hf, while a (possibly infinite) set of HF sets has type hf set. Definitions are polymorphic in the type 'a of alphabet symbols, while words have type 'a list.

The record definition below declares the components of a DFA. The types make it clear that there is indeed a set of states but only a single initial state, etc.

    record 'a dfa = states :: "hf set"
                    init   :: "hf"
                    final  :: "hf set"
                    nxt    :: "hf ⇒ 'a ⇒ hf"

Now we package up the axioms of the DFA as a locale [1]:

    locale dfa =
      fixes M :: "'a dfa"
      assumes init:   "init M ∈ states M"
          and final:  "final M ⊆ states M"
          and nxt:    "⋀q x. q ∈ states M ⟹ nxt M q x ∈ states M"
          and finite: "finite (states M)"

The last assumption is needed because the states field has type hf set and not hf. The locale bundles the assumptions above into a local context, where they are directly available. It is then easy to define the accepted language.

    primrec nextl :: "hf ⇒ 'a list ⇒ hf" where
        "nextl q []     = q"
      | "nextl q (x#xs) = nextl (nxt M q x) xs"

    definition language :: "('a list) set" where
      "language ≡ {xs. nextl (init M) xs ∈ final M}"

Equivalence relations play a significant role below. The following relation regards two strings as equivalent if they take the machine to the same state [7, p. 90].

    definition eq_nextl :: "('a list × 'a list) set" where
      "eq_nextl ≡ {(u,v). nextl (init M) u = nextl (init M) v}"

Note that language and eq_nextl take no arguments, but refer to the locale.
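As an informal illustration outside Isabelle, the record, nextl and language can be mirrored in a few lines of Python. This is only a sketch under stated assumptions: any hashable Python value stands in for type hf, the locale axioms are not checked, and the helper accepts and the example machine even_a are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Hashable, Sequence

State = Hashable  # assumption: any hashable value stands in for type hf


@dataclass(frozen=True)
class DFA:
    states: FrozenSet[State]
    init: State
    final: FrozenSet[State]
    nxt: Callable[[State, str], State]


def nextl(m: DFA, q: State, word: Sequence[str]) -> State:
    """Extend nxt from single symbols to words, symbol by symbol."""
    for x in word:
        q = m.nxt(q, x)
    return q


def accepts(m: DFA, word: Sequence[str]) -> bool:
    """Membership in the language: the run from init ends in a final state."""
    return nextl(m, m.init, word) in m.final


def eq_nextl(m: DFA, u: Sequence[str], v: Sequence[str]) -> bool:
    """Two words are related if they take the machine to the same state."""
    return nextl(m, m.init, u) == nextl(m, m.init, v)


# Hypothetical example: words over {a, b} containing an even number of a's.
even_a = DFA(
    states=frozenset({0, 1}),
    init=0,
    final=frozenset({0}),
    nxt=lambda q, x: 1 - q if x == "a" else q,
)
```

Passing the machine explicitly as m corresponds to using the definitions outside the locale, where they take a DFA as an argument.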

The Myhill-Nerode theorem asserts the equivalence of three characterisations of regular languages. The first of these is to be the language accepted by some DFA. The other two are connected with certain equivalence relations, called Myhill-Nerode relations, on words of the language.

The definitions below are outside of the locale and are therefore independent of any particular DFA. The predicate dfa refers to the locale axioms and expresses that its argument, M, is a DFA. The predicate dfa.language refers to the constant language: outside of the locale, it takes a DFA as an argument.

    definition regular :: "('a list) set ⇒ bool" where
      "regular L ≡ ∃M. dfa M ∧ dfa.language M = L"

The other characterisations of a regular language involve abstract finite state machines derived from the language itself, with certain equivalence classes as the states. A relation is right invariant if it satisfies the following closure property.

    definition right_invariant :: "('a list × 'a list) set ⇒ bool" where
      "right_invariant r ≡ (∀u v w. (u,v) ∈ r ⟶ (u@w, v@w) ∈ r)"

The intuition is that if two words u and v are related, then each word brings the "machine" to the same state, and once this has happened, this agreement must continue no matter how the words are extended as u@w and v@w.

A Myhill-Nerode relation for a language L is a right invariant equivalence relation of finite index where L is the union of some of the equivalence classes [7, p. 90]. Finite index means the set of equivalence classes is finite: finite (UNIV//R). The equivalence classes will be the states of a finite state machine. The equality L = R``A, where A ⊆ L is a set of words of the language, expresses L as the union of a set of equivalence classes, which will be the final states.

    definition MyhillNerode :: "'a list set ⇒ ('a list * 'a list) set ⇒ bool" where
      "MyhillNerode L R ≡ equiv UNIV R ∧ right_invariant R ∧
                          finite (UNIV//R) ∧ (∃A. L = R``A)"

While eq_nextl (defined above) refers to a particular machine, eq_app_right is defined in terms of a language, L. It relates the words u and v if all extensions of them, u@w and v@w, behave equally with respect to L. (UNIV denotes a typed universal set, here the set of all words.)

    definition eq_app_right :: "'a list set ⇒ ('a list * 'a list) set" where
      "eq_app_right L ≡ {(u,v). ∀w. u@w ∈ L ⟷ v@w ∈ L}"

It is a Myhill-Nerode relation for L provided it is of finite index:

    lemma MN_eq_app_right:
      "finite (UNIV // eq_app_right L) ⟹ MyhillNerode L (eq_app_right L)"
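The relation eq_app_right quantifies over all extensions w, so it cannot be tested exhaustively. The following Python sketch checks it only for extensions up to a length bound k, which gives a necessary condition rather than the relation itself. The names words_up_to, eq_app_right_bounded, bounded_classes and the example language even_a are hypothetical, introduced purely for illustration.

```python
from itertools import product
from typing import Callable, Iterator, List


def words_up_to(alphabet: str, k: int) -> Iterator[str]:
    """All words over the alphabet of length at most k, shortest first."""
    for n in range(k + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def eq_app_right_bounded(in_lang: Callable[[str], bool],
                         u: str, v: str, alphabet: str, k: int) -> bool:
    """Bounded approximation: u@w and v@w agree on L for every |w| <= k."""
    return all(in_lang(u + w) == in_lang(v + w)
               for w in words_up_to(alphabet, k))


def bounded_classes(in_lang: Callable[[str], bool],
                    alphabet: str, k: int) -> List[List[str]]:
    """Group the words of length <= k by their behaviour on extensions."""
    classes = {}
    for u in words_up_to(alphabet, k):
        signature = tuple(in_lang(u + w) for w in words_up_to(alphabet, k))
        classes.setdefault(signature, []).append(u)
    return list(classes.values())


def even_a(word: str) -> bool:
    """Hypothetical example language: an even number of a's."""
    return word.count("a") % 2 == 0
```

For this example language the bounded grouping finds two classes, matching the two states of the obvious DFA; in general the bound k may merge words that a longer extension would separate.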

Moreover, every Myhill-Nerode relation R for L refines eq_app_right L.

    lemma MN_refines_eq_app_right:
      "MyhillNerode L R ⟹ R ⊆ eq_app_right L"

This essentially states that eq_app_right L is the most abstract Myhill-Nerode relation for L. This will eventually yield a way of defining a minimal machine.

The Myhill-Nerode theorem says that these three statements are equivalent [6]:

1. The set L is a regular language (is accepted by some DFA).
2. There exists some Myhill-Nerode relation R for L.
3. The relation eq_app_right L has finite index.

We have (1) ⇒ (2) because eq_nextl is a Myhill-Nerode relation. We have (2) ⇒ (3), by lemma MN_refines_eq_app_right, because every equivalence class for eq_app_right L is the union of equivalence classes of R, and so eq_app_right L has minimal index for all Myhill-Nerode relations. We get (3) ⇒ (1) by constructing a DFA whose states are the (finitely many) equivalence classes of eq_app_right L. This construction can be done for every Myhill-Nerode relation.

Until now, all proofs have been routine. But now we face a difficulty: the states of our machine should be equivalence classes of words, but these could be infinite sets. What can be done? The solution adopted here is to map the equivalence classes to the natural numbers, which are easily embedded in HF. Proving that the set of equivalence classes is finite gives us such a map.

Mapping infinite sets to integers seems to call into question the very idea of representing states by HF sets. However, mapping sets to integers turns out to be convenient only occasionally, and it is not necessary: we could formalise DFAs differently, coding symbols (and therefore words) as HF sets. Then we could represent states by representatives (having type hf) of equivalence classes. Using Isabelle's type-class system to identify the types (integers, booleans, lists, etc.) that can be embedded into HF, type 'a dfa could still be polymorphic in the type of symbols. But the approach followed here is simpler.
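The embedding of natural numbers into HF mentioned above can be pictured with nested Python frozensets. The sketch below is an illustration only, not part of the Isabelle development: it assumes the von Neumann representation n = {0, ..., n-1} for ord_of, and it implements the coding f of Sect. 2 together with its inverse under the hypothetical names code and decode.

```python
def ord_of(n: int) -> frozenset:
    """Von Neumann ordinal: 0 is the empty set and n+1 = n ∪ {n}."""
    s = frozenset()
    for _ in range(n):
        s = s | {s}
    return s


def code(x: frozenset) -> int:
    """The coding f(x) = sum of 2^f(y) over the elements y of x."""
    return sum(2 ** code(y) for y in x)


def decode(n: int) -> frozenset:
    """Inverse of code: bit i of n is set iff decode(i) is an element."""
    return frozenset(decode(i) for i in range(n.bit_length()) if (n >> i) & 1)
```

For instance, ord_of(3) = {0, 1, 2} has elements coded 0, 1 and 3, so its own code is 2^0 + 2^1 + 2^3 = 11. The code of an ordinal is therefore not the ordinal's numeric value, which is why the locale below asks for a separate bijection h onto hfset (ord_of n).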
If R is a Myhill-Nerode relation for a language L, then the set of equivalence classes is finite and yields a DFA for L. The construction is packaged as a locale, which is used once in the proof of the Myhill-Nerode theorem, and again to prove that minimal DFAs are unique. The locale includes not only L and R, but also the set A of accepting states, the cardinality n and the bijection h between the set UNIV//R of equivalence classes and the number n as represented in HF. The locale assumes the Myhill-Nerode conditions.

    locale MyhillNerode_dfa =
      fixes L :: "('a list) set"
        and R :: "('a list * 'a list) set"
        and A :: "('a list) set"
        and n :: nat
        and h :: "('a list) set ⇒ hf"
      assumes eqR: "equiv UNIV R"
          and riR: "right_invariant R"
          and L:   "L = R``A"
          and h:   "bij_betw h (UNIV//R) (hfset (ord_of n))"

The DFA is defined within the locale. The states are given by the equivalence classes. The initial state is the equivalence class for the empty word; the set of final states is derived from the set A of words that generate L; the next-state function maps the equivalence class for the word u to that for u@[x]. Equivalence classes are not the actual states here, but are mapped to integers via the bijection h. As mentioned above, this use of integers is not essential.

    definition DFA :: "'a dfa" where
      "DFA = (| states = h ` (UNIV//R),
                init   = h (R `` {[]}),
                final  = {h (R `` {u}) | u. u ∈ A},
                nxt    = λq x. h (⋃u ∈ h⁻¹ q. R `` {u@[x]}) |)"

This can be proved to be a DFA easily. One proof line, using the right-invariance property and lemmas about quotients [11], proves that the next-state function respects the equivalence relation. Four more lines are needed to verify the properties of a DFA, somewhat more to show that the language of this DFA is indeed L.

The facts proved within the locale are summarised (outside its scope) by the following theorem, stating that every Myhill-Nerode relation yields an equivalent DFA. (The obtains form expresses existential and multiple conclusions.)

    theorem MN_imp_dfa:
      assumes "MyhillNerode L R"
      obtains M where "dfa M" "dfa.language M = L"
                      "card (states M) = card (UNIV//R)"

This completes the (3) ⇒ (1) stage, by far the hardest, of the Myhill-Nerode theorem. The three stages are shown below. Lemma L2_3 includes a result about cardinality: the construction yields a minimal DFA, which will be useful later.

    lemma L1_2: "regular L ⟹ ∃R. MyhillNerode L R"

    lemma L2_3:
      assumes "MyhillNerode L R"
      obtains "finite (UNIV // eq_app_right L)"
              "card (UNIV // eq_app_right L) ≤ card (UNIV // R)"

    lemma L3_1: "finite (UNIV // eq_app_right L) ⟹ regular L"

Nondeterministic Automata and Closure Proofs

As most of the proofs are simple, our focus will be the use of HF sets when defining automata. Our main example is the powerset construction for transforming a nondeterministic automaton into a deterministic one.

As in the deterministic case, a record holds the necessary components, while a locale encapsulates the axioms. Component eps deals with ε-transitions.

    record 'a nfa = states :: "hf set"
                    init   :: "hf set"
                    final  :: "hf set"
                    nxt    :: "hf ⇒ 'a ⇒ hf set"
                    eps    :: "(hf * hf) set"

The axioms are obvious: the initial, final and next states belong to the set of states, which is finite. An axiom restricting ε-transitions to machine states was removed, as it did not simplify proofs. Working with ε-transitions is messy. It helps to provide special treatment for NFAs having no ε-transitions. Allowing multiple initial states reduces the need for ε-transitions.

    locale nfa =
      fixes M :: "'a nfa"
      assumes init:   "init M ⊆ states M"
          and final:  "final M ⊆ states M"
          and nxt:    "⋀q x. q ∈ states M ⟹ nxt M q x ⊆ states M"
          and finite: "finite (states M)"

The following function "closes up" a set Q of states under ε-transitions. Intersection with states M confines these transitions to legal states.

    definition epsclo :: "hf set ⇒ hf set" where
      "epsclo Q ≡ states M ∩ (⋃q ∈ Q. {q'. (q,q') ∈ (eps M)*})"

The remaining definitions are straightforward. Note that nextl generalises nxt to take a set of states as well as a list of symbols.

    primrec nextl :: "hf set ⇒ 'a list ⇒ hf set" where
        "nextl Q []     = epsclo Q"
      | "nextl Q (x#xs) = nextl (⋃q ∈ epsclo Q. nxt M q x) xs"

    definition language :: "('a list) set" where
      "language ≡ {xs. nextl (init M) xs ∩ final M ≠ {}}"

The construction of a DFA to simulate a given NFA is elementary, and is a good demonstration of the HF sets. The strongly-typed approach used here requires a pair of coercion functions hfset :: "hf ⇒ hf set" and HF :: "hf set ⇒ hf" to convert between HF sets and ordinary sets.

    lemma HF_hfset: "HF (hfset a) = a"
    lemma hfset_HF: "finite A ⟹ hfset (HF A) = A"

With this approach, type-checking indicates whether we are dealing with a set of states or a single state. The drawback is that we occasionally have to show that a set of states is finite in the course of reasoning about the coercions, which would never be necessary if we confined our reasoning to the HF world.

Here is the definition of the DFA. The states are ε-closed subsets of NFA states, coerced to type hf. The initial and final states are defined similarly, while the next-state function requires both coercions and performs ε-closure before and after. We work in locale nfa, with access to the components of the NFA.

    definition Power_dfa :: "'a dfa" where
      "Power_dfa = (| dfa.states = HF ` epsclo ` Pow (states M),
                      init  = HF (epsclo (init M)),
                      final = {HF (epsclo Q) | Q. Q ⊆ states M ∧ Q ∩ final M ≠ {}},
                      nxt   = λQ x. HF (⋃q ∈ epsclo (hfset Q). epsclo (nxt M q x)) |)"

Proving that this is a DFA is trivial. The hardest case is to show that the next-state function maps states to states. Proving that the two automata accept the same language is also simple, by reverse induction on lists (the induction step concerns u@[x], putting x at the end). Here, Power.language refers to the language of the powerset DFA, while language refers to that of the NFA.

    theorem Power_language: "Power.language = language"
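The powerset construction can be exercised on a small machine in Python, where a frozenset plays the part of a set of states coerced to a single value by HF. This is a hedged sketch, not the Isabelle definition: the class NFA, its step method, the functions power_init, power_nxt and power_accepts, and the example machine ends_ab are all hypothetical names, and only the subsets actually visited by a run are ever built.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, FrozenSet, Iterable, Tuple


@dataclass(frozen=True)
class NFA:
    states: FrozenSet[int]
    init: FrozenSet[int]
    final: FrozenSet[int]
    nxt: Dict[Tuple[int, str], FrozenSet[int]]
    eps: FrozenSet[Tuple[int, int]]

    def step(self, q: int, x: str) -> FrozenSet[int]:
        return self.nxt.get((q, x), frozenset())


def epsclo(m: NFA, qs: Iterable[int]) -> FrozenSet[int]:
    """Close qs under eps-transitions, then confine the result to legal states."""
    seen, todo = set(qs), list(qs)
    while todo:
        q = todo.pop()
        for p, r in m.eps:
            if p == q and r not in seen:
                seen.add(r)
                todo.append(r)
    return frozenset(seen) & m.states


def nfa_accepts(m: NFA, word: str) -> bool:
    """Run the NFA directly, following the recursion for nextl on state sets."""
    qs = m.init
    for x in word:
        qs = frozenset().union(*(m.step(q, x) for q in epsclo(m, qs)))
    return bool(epsclo(m, qs) & m.final)


def power_init(m: NFA) -> FrozenSet[int]:
    return epsclo(m, m.init)


def power_nxt(m: NFA, Q: FrozenSet[int], x: str) -> FrozenSet[int]:
    """Deterministic step on eps-closed sets: closure before and after."""
    return frozenset().union(*(epsclo(m, m.step(q, x)) for q in epsclo(m, Q)))


def power_accepts(m: NFA, word: str) -> bool:
    Q = power_init(m)
    for x in word:
        Q = power_nxt(m, Q, x)
    return bool(Q & m.final)


def all_words(alphabet: str, k: int):
    return ("".join(t) for n in range(k + 1) for t in product(alphabet, repeat=n))


# Hypothetical example: words ending in "ab", with one eps-transition 2 -> 3.
ends_ab = NFA(
    states=frozenset({0, 1, 2, 3}),
    init=frozenset({0}),
    final=frozenset({3}),
    nxt={(0, "a"): frozenset({0, 1}), (0, "b"): frozenset({0}),
         (1, "b"): frozenset({2})},
    eps=frozenset({(2, 3)}),
)
```

Comparing nfa_accepts with power_accepts on all short words gives a finite spot check of the statement proved in general by Power_language.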

The set of languages accepted by some DFA is closed under complement, intersection, concatenation, repetition (Kleene star), etc. [6]. Consider intersection:

    theorem regular_Int:
      assumes S: "regular S" and T: "regular T"
      shows "regular (S ∩ T)"
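The usual proof of this closure property runs two machines in parallel on pairs of states. A small Python sketch of that product construction follows, with Python tuples standing in for HF ordered pairs. The names DFA, run, product_dfa and the two example machines are hypothetical and serve only to make the idea concrete.

```python
from itertools import product
from typing import Callable, NamedTuple


class DFA(NamedTuple):
    states: frozenset
    init: object
    final: frozenset
    nxt: Callable


def run(m: DFA, word: str) -> bool:
    """Accept iff the run from the initial state ends in a final state."""
    q = m.init
    for x in word:
        q = m.nxt(q, x)
    return q in m.final


def product_dfa(ms: DFA, mt: DFA) -> DFA:
    """States are pairs; both components step together on each symbol."""
    return DFA(
        states=frozenset((p, q) for p in ms.states for q in mt.states),
        init=(ms.init, mt.init),
        final=frozenset((p, q) for p in ms.final for q in mt.final),
        nxt=lambda pq, x: (ms.nxt(pq[0], x), mt.nxt(pq[1], x)),
    )


def all_words(alphabet: str, k: int):
    return ("".join(t) for n in range(k + 1) for t in product(alphabet, repeat=n))


# Hypothetical examples: an even number of a's, and words ending in b.
even_a = DFA(frozenset({0, 1}), 0, frozenset({0}),
             lambda q, x: 1 - q if x == "a" else q)
ends_b = DFA(frozenset({0, 1}), 0, frozenset({1}),
             lambda q, x: 1 if x == "b" else 0)
both = product_dfa(even_a, ends_b)
```

The product machine has |states MS| × |states MT| states, here four, and accepts a word exactly when both component machines do.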

The recognising DFA is created by forming the Cartesian product of the sets of states of MS and MT, the DFAs of the two languages. The machines are effectively run in parallel. The decision to represent a set of states by type hf set rather than by type hf means we cannot write dfa.states MS × dfa.states MT, but we can express this concept using set comprehension:

    "(| states = {⟨q1,q2⟩ | q1 q2. q1 ∈ dfa.states MS ∧ q2 ∈ dfa.states MT},
        init   = ⟨dfa.init MS, dfa.init MT⟩,
        final  = {⟨q1,q2⟩ | q1 q2. q1 ∈ dfa.final MS ∧ q2 ∈ dfa.final MT},
        nxt    = λ⟨qs,qt⟩ x. ⟨dfa.nxt MS qs x, dfa.nxt MT qt x⟩ |)"

This is trivially shown to be a DFA. Showing that it accepts the intersection of the given languages is again easy by reverse induction.

Closure under concatenation is expressed as follows:

    theorem regular_conc:
      assumes S: "regular S" and T: "regular T"
      shows "regular (S @@ T)"

The concatenation is recognised by an NFA involving the disjoint sum of the sets of states of MS and MT, the DFAs of the two languages. The effect is to simulate the first machine until it accepts a string, then to transition to a simulation of the second machine. There are ε-transitions linking every final state of MS to the initial state of MT. We again cannot write dfa.states MS + dfa.states MT, but we can express the disjoint sum naturally enough:

    "(| states = Inl ` (dfa.states MS) ∪ Inr ` (dfa.states MT),
        init   = {Inl (dfa.init MS)},
        final  = Inr ` (dfa.final MT),
        nxt    = λq x. sum_case (λqs. {Inl (dfa.nxt MS qs x)})
                                (λqt. {Inr (dfa.nxt MT qt x)}) q,
        eps    = (λq. (Inl q, Inr (dfa.init MT))) ` dfa.final MS |)"

Again, it is trivial to show that this is an NFA. But unusually, proving that it recognises the concatenation of the languages is a challenge. We need to show, by induction, that the "left part" of the NFA correctly simulates MS.

    have "⋀q. Inl q ∈ ST.nextl {Inl (dfa.init MS)} u ⟷
              q = (dfa.nextl MS (dfa.init MS) u)"

The key property is that any string accepted by the NFA can be split into strings accepted by the two DFAs. The proof involves a fairly messy induction.

    have "⋀q. Inr q ∈ ST.nextl {Inl (dfa.init MS)} u ⟷
              (∃uS uT. uS ∈ dfa.language MS ∧ u = uS@uT ∧
                       q = dfa.nextl MT (dfa.init MT) uT)"

Closure under Kleene star is not presented here, as it involves no interesting set operations. The language L* is recognised by an NFA with an extra state, which serves as the initial state and runs the DFA for L including iteration. The proofs are messy, with many cases. To their credit, Hopcroft and Ullman [6] give some details, while other authors content themselves with diagrams alone.

Given a regular language L, the Myhill-Nerode theorem yields a DFA having the minimum number of states. But it does not yield a minimisation algorithm for a given automaton. It turns out that a DFA is minimal if it has no unreachable states and if no two states are indistinguishable (in a sense made precise below). This again does not yield an algorithm. Brzozowski's minimisation algorithm involves reversing the DFA to create an NFA, converting back to a DFA via powersets, removing unreachable states, then repeating those steps to undo the reversal. Surprisingly, it performs well in practice [3].

The following developments are done within the locale dfa, and therefore refer to one particular deterministic finite automaton.

The left language of a state $q$ is the set of all words $w$ such that $q_0 \xrightarrow{w}^* q$, or informally, such that the machine when started in the initial state and given the word $w$ ends up in $q$. In a DFA, the left languages of distinct states are disjoint, if they are nonempty.

    definition left_lang :: "hf ⇒ ('a list) set" where
      "left_lang q ≡ {u. nextl (init M) u = q}"

The right language of a state $q$ is the set of all words $w$ such that $q \xrightarrow{w}^* q_f$, where $q_f$ is a final state, or informally, such that the machine when started in $q$ will accept the word $w$. The language of a DFA is the right language of $q_0$. Two states having the same right language are indistinguishable: they both lead to the same words being accepted.

    definition right_lang :: "hf ⇒ ('a list) set" where
      "right_lang q ≡ {u. nextl q u ∈ final M}"

The accessible states are those that can be reached by at least one word.

    definition accessible :: "hf set" where
      "accessible ≡ {q. left_lang q ≠ {}}"

The function path_to returns one specific such word. This function will eventually be used to express an isomorphism between any minimal DFA (one having no inaccessible or indistinguishable states) and the canonical DFA determined by the Myhill-Nerode theorem.

    definition path_to :: "hf ⇒ 'a list" where
      "path_to q ≡ SOME u. u ∈ left_lang q"

    lemma nextl_path_to:
      "q ∈ accessible ⟹ nextl (dfa.init M) (path_to q) = q"

First, we deal with the problem of inaccessible states. It is easy to restrict any DFA to one having only accessible states.

    definition Accessible_dfa :: "'a dfa" where
      "Accessible_dfa = (| dfa.states = accessible,
                           init  = init M,
                           final = final M ∩ accessible,
                           nxt   = nxt M |)"

This construction is readily shown to be a DFA that agrees with the original in most respects. In particular, the two automata agree on left_lang and right_lang, and therefore on the language they accept:

    lemma Accessible_language: "Accessible.language = language"
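Computationally, the accessible states and a witness word for each can be found by a breadth-first search from the initial state. The Python sketch below is one way to do this and is not taken from the Isabelle development: where path_to uses the choice operator SOME, the hypothetical function paths_to returns a shortest word, and accessible_dfa and the example table are likewise illustrative names.

```python
from collections import deque


def paths_to(init, nxt, alphabet):
    """Map each accessible state to one word that reaches it from init."""
    path = {init: ""}
    queue = deque([init])
    while queue:
        q = queue.popleft()
        for x in alphabet:
            r = nxt(q, x)
            if r not in path:
                path[r] = path[q] + x
                queue.append(r)
    return path


def accessible_dfa(states, init, final, nxt, alphabet):
    """Restrict the state set and final states to the accessible states."""
    acc = frozenset(paths_to(init, nxt, alphabet))
    return acc, init, frozenset(final) & acc, nxt


# Hypothetical example: state 2 cannot be reached from the initial state 0.
table = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1,
         (2, "a"): 0, (2, "b"): 2}


def nxt(q, x):
    return table[(q, x)]


paths = paths_to(0, nxt, "ab")
small = accessible_dfa(frozenset({0, 1, 2}), 0, frozenset({0, 2}), nxt, "ab")
```

Here paths records the empty word for state 0 and the word a for state 1, and the restricted machine drops state 2 from both the state set and the final states.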

We can now define a DFA to be minimal if all states are accessible and no two states have the same right language. (The formula inj_on right_lang (dfa.states M) expresses that the function right_lang is injective on the set dfa.states M.)

    definition minimal where
      "minimal ≡ accessible = states M ∧ inj_on right_lang (dfa.states M)"

Because we are working within the DFA locale, minimal is a constant referring to one particular automaton.
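Right languages are infinite objects in general, so a program cannot compare them directly. One standard substitute, which is our choice here and not something the formal development uses, is Moore-style partition refinement: start from the split into final and non-final states and refine until stable. The Python sketch below checks both conditions of minimal under that assumption; right_lang_blocks, reachable and is_minimal are hypothetical names, and nxt is assumed to be total on the given states.

```python
def right_lang_blocks(states, final, nxt, alphabet):
    """Assign equal block numbers exactly to states with equal right languages."""
    block = {q: int(q in final) for q in states}
    while True:
        # A state's signature is its block plus the blocks of its successors.
        sig = {q: (block[q],) + tuple(block[nxt(q, x)] for x in alphabet)
               for q in states}
        ids = {}
        refined = {q: ids.setdefault(sig[q], len(ids)) for q in states}
        if len(ids) == len(set(block.values())):
            return refined  # no block was split, so the partition is stable
        block = refined


def reachable(init, nxt, alphabet):
    seen, todo = {init}, [init]
    while todo:
        q = todo.pop()
        for x in alphabet:
            r = nxt(q, x)
            if r not in seen:
                seen.add(r)
                todo.append(r)
    return frozenset(seen)


def is_minimal(states, init, final, nxt, alphabet):
    """All states accessible, and right languages pairwise distinct."""
    blocks = right_lang_blocks(states, final, nxt, alphabet)
    return (reachable(init, nxt, alphabet) == frozenset(states)
            and len(set(blocks.values())) == len(states))


# Hypothetical example: states 0 and 2 both mean "even number of a's so far".
table = {(0, "a"): 1, (1, "a"): 2, (2, "a"): 1,
         (0, "b"): 0, (1, "b"): 1, (2, "b"): 2}


def nxt3(q, x):
    return table[(q, x)]


def nxt2(q, x):
    return 1 - q if x == "a" else q
```

On the three-state example, states 0 and 2 fall into the same block, so is_minimal reports False; the two-state machine for the same language passes.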

We can deal with indistinguishable states similarly, defining a DFA in which the indistinguishable states are identified via equivalence classes. This is not part of Brzozowski's minimisation algorithm, but it is interesting in its own right: the equivalence classes themselves are HF sets. We begin by declaring a relation stating that two states are equivalent if they have the same right language.

    definition eq_right_lang :: "(hf × hf) set" where
      "eq_right_lang ≡ {(u,v). u ∈ states M ∧ v ∈ states M ∧
                               right_lang u = right_lang v}"

Trivially, this is an equivalence relation, and equivalence classes of states are finite (there are only finitely many states). In the corresponding DFA, these equivalence classes form the states, with the initial and final states given by the equivalence classes for the corresponding states of the original DFA. As usual, the function HF is used to coerce a set of states to type hf.

    definition Collapse_dfa :: "'a dfa" where
      "Collapse_dfa = (| dfa.states = HF ` (states M // eq_right_lang),
                         init  = HF (eq_right_lang `` {init M}),
                         final = {HF (eq_right_lang `` {q}) | q. q ∈ final M},
                         nxt   = λQ x. HF (⋃q ∈ hfset Q. eq_right_lang `` {nxt M q x}) |)"

This is easily shown to be a DFA, and the next-state function respects the equivalence relation. Showing that it accepts the same language is straightforward.

    lemma ext_language_Collapse_dfa:
      "u ∈ Collapse.language ⟷ u ∈ language"

The property minimal is true for machines having no inaccessible or indistinguishable states. To prove that such a machine actually has a minimal number of states is tricky. It can be shown to be isomorphic to the canonical machine from the Myhill-Nerode theorem, which indeed has a minimal number of states.

Automata M and N are isomorphic if there exists a bijection h between their state sets that preserves their initial, final and next states. This conception is nicely captured by a locale, taking the DFAs as parameters:

    locale dfa_isomorphism = M: dfa M + N: dfa N
        for M :: "'a dfa" and N :: "'a dfa" +
      fixes h :: "hf ⇒ hf"
      assumes h:     "bij_betw h (states M) (states N)"
          and init:  "h (init M) = init N"
          and final: "h ` final M = final N"
          and nxt:   "⋀q x. q ∈ states M ⟹ h (nxt M q x) = nxt N (h q) x"

With this concept at our disposal, we resume working within the locale dfa, which is concerned with the automaton M. If no two states have the same right language, then there is a bijection between the accessible states (of M) and the equivalence classes yielded by the relation eq_app_right language.

    lemma inj_right_lang_imp_eq_app_right_index:
      assumes "inj_on right_lang (dfa.states M)"
      shows "bij_betw (λq. eq_app_right language `` {path_to q})
                      accessible (UNIV // eq_app_right language)"

This bijection maps the state q to eq_app_right language `` {path_to q}. Every element of the quotient UNIV // eq_app_right language can be expressed in this form. And therefore, the number of states in a minimal machine equals the index of eq_app_right language.

    definition min_states where
      "min_states ≡ card (UNIV // eq_app_right language)"

    lemma minimal_imp_index_eq_app_right:
      "minimal ⟹ card (dfa.states M) = min_states"

In the proof of the Myhill-Nerode theorem, it emerged that this index was the minimum cardinality for any DFA accepting the given language. Any other automaton, M', accepting the same language cannot have fewer states. This theorem justifies the claim that minimal indeed characterises a minimal DFA.

    theorem minimal_imp_card_states_le:
      "⟦minimal; dfa M'; dfa.language M' = language⟧
       ⟹ card (dfa.states M) ≤ card (dfa.states M')"

Note that while the locale dfa gives us implicit access to one DFA, namely M, it is still possible to refer to other automata, as we see above.

The minimal machine is unique up to isomorphism because every minimal machine is isomorphic to the canonical Myhill-Nerode DFA. The construction of a DFA from a Myhill-Nerode relation was packaged as a locale, and by applying this locale to the given language and the relation eq_app_right language, we can generate the instance we need.

    interpretation Canon:
      MyhillNerode_dfa language "eq_app_right language"
                       language min_states index_f

Here, index_f denotes some bijection between the equivalence classes and their cardinality (as an HF ordinal). It exists (definition omitted) by the definition of cardinality itself. It is the required isomorphism function between M and the canonical DFA of Sect. 3.4, which is written Canon.DFA.

    definition iso :: "hf ⇒ hf" where
      "iso ≡ index_f o (λq. eq_app_right language `` {path_to q})"

The isomorphism property is stated using locale dfa_isomorphism.

    theorem minimal_imp_isomorphic_to_canonical:
      assumes minimal
      shows "dfa_isomorphism M Canon.DFA iso"

Verifying the isomorphism conditions requires delicate reasoning. Hopcroft and Ullman's proof [6, p. 29–30] provides just a few clues.

At the core of this minimisation algorithm is an NFA obtained by reversing all the transitions of a given DFA, and exchanging the initial and final states.

    definition Reverse_nfa :: "'a dfa ⇒ 'a nfa" where
      "Reverse_nfa MS = (| nfa.states = dfa.states MS,
                           init  = dfa.final MS,
                           final = {dfa.init MS},
                           nxt   = λq x. {p ∈ dfa.states MS. q = dfa.nxt MS p x},
                           eps   = {} |)"

This is easily shown to be an NFA that accepts the reverse of every word accepted by the original DFA. Applying the powerset construction yields a new DFA that has no indistinguishable states. The point is that the right language of a powerset state is derived from the right languages of the constituent states of the reversal NFA [3]. Those, in turn, are the left languages of the original DFA, and these are disjoint (since the original DFA has no inaccessible states, by assumption).

    lemma inj_on_right_lang_PR:
      assumes "dfa.states M = accessible"
      shows "inj_on (dfa.right_lang (nfa.Power_dfa (Reverse_nfa M)))
                    (dfa.states (nfa.Power_dfa (Reverse_nfa M)))"

The following definitions abbreviate the steps of Brzozowski's algorithm.

    abbreviation APR :: "'x dfa ⇒ 'x dfa" where
      "APR X ≡ dfa.Accessible_dfa (nfa.Power_dfa (Reverse_nfa X))"

    definition Brzozowski :: "'a dfa" where
      "Brzozowski ≡ APR (APR M)"
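The three steps abbreviated by APR can be fused into one reachable-subset exploration, as in the following Python sketch. It is an illustration under stated assumptions rather than a transcription of the Isabelle definitions: a DFA is a plain tuple (alphabet, states, init, final, delta) with a total transition dictionary delta, and the names apr, brzozowski, run and the example machine m3 are hypothetical.

```python
from itertools import product


def apr(alphabet, states, init, final, delta):
    """Reverse the DFA, determinise by subsets, keep only accessible subsets."""

    def rev_step(qs, x):
        # Reversed arrows: all p whose x-successor lies in the current set.
        return frozenset(p for p in states if delta[(p, x)] in qs)

    start = frozenset(final)  # initial states of the reversed machine
    new_states, todo, new_delta = {start}, [start], {}
    while todo:
        qs = todo.pop()
        for x in alphabet:
            rs = rev_step(qs, x)
            new_delta[(qs, x)] = rs
            if rs not in new_states:
                new_states.add(rs)
                todo.append(rs)
    # A subset is final iff it contains the original initial state.
    new_final = frozenset(qs for qs in new_states if init in qs)
    return alphabet, frozenset(new_states), start, new_final, new_delta


def brzozowski(*dfa):
    """Apply the reverse/powerset/accessible step twice."""
    return apr(*apr(*dfa))


def run(dfa, word):
    _, _, q, final, delta = dfa
    for x in word:
        q = delta[(q, x)]
    return q in final


def all_words(alphabet, k):
    return ("".join(t) for n in range(k + 1) for t in product(alphabet, repeat=n))


# Hypothetical example: a redundant 3-state machine for "even number of a's".
m3 = ("ab", frozenset({0, 1, 2}), 0, frozenset({0, 2}),
      {(0, "a"): 1, (1, "a"): 2, (2, "a"): 1,
       (0, "b"): 0, (1, "b"): 1, (2, "b"): 2})
m_min = brzozowski(*m3)
```

On this example the first pass already merges states 0 and 2, and the second pass restores the original orientation, leaving a two-state machine for the same language.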

By the lemma proved just above, the

APR operation yields minimal DFAs. theorem minimal APR: assumes "dfa.states M = accessible" shows "dfa.minimal (APR M)"

Brzozowski’s minimisation algorithm is correct. The ﬁrst APR call reverses thelanguage and eliminates inaccessible states; the second call yields a minimalmachine for the original language. The proof uses the theorems just proved. heorem minimal Brzozowski: "dfa.minimal Brzozowski" unfolding

Brzozowski def proof (rule dfa.minimal APR) show "dfa (APR M)" by (simp add: dfa.dfa Accessible nfa.dfa Power nfa Reverse nfa) nextshow "dfa.states (APR M) = dfa.accessible (APR M)" by (simp add: dfa.Accessible accessible dfa.states Accessible dfanfa.dfa Power nfa Reverse nfa) qed There is a great body of prior work. One approach involves working construc-tively, in some sort of type theory. Constable’s group has formalised automata[4] in Nuprl, including the Myhill-Nerode theorem. Using type theory in the formof Coq and its Ssreﬂect library, Doczkal et al. [5] formalise much of the same ma-terial as the present paper. They omit ǫ -transitions and Brzozowski’s algorithmand add the pumping lemma and Kleene’s algorithm for translating a DFA toa regular expression. Their development is of a similar length, under 1400 lines,and they allow the states of a ﬁnite automaton to be given by any ﬁnite type. Ina substantial development, Braibant and Pous [2] have implemented a tactic forsolving equations in Kleene algebras by implementing eﬃcient ﬁnite automataalgorithms in Coq. They represent states by integers.An early example of regular expression theory formalised using higher-orderlogic (Isabelle/HOL) is Nipkow’s veriﬁed lexical analyser [9]. His automata arepolymorphic in the types of state and symbols. NFAs are included, with ǫ -transitions simulated by an alphabet extended with a dummy symbol.Recent Isabelle developments explicitly bypass automata theory. Wu et al.[15] prove the Myhill-Nerode theorem using regular expressions. This is a signif-icant feat, especially considering that the theorem’s underlying intuitions comefrom automata. Current work on regular expression equivalence [8,10] continuesto focus on regular expressions rather than ﬁnite automata.This paper describes not a project undertaken by a team, but a six-weekcase study by one person. 
Its successful outcome obviously reflects Isabelle’s powerful automation, but the key factor is the simplicity of the specifications. Finite automata cause complications in the prior work. The HF sets streamline the specifications and allow elementary set-theoretic reasoning.

The theory of finite automata can be developed straightforwardly using higher-order logic and HF set theory. We can formalise the textbook proofs: there is no need to shun automata or use constructive type theories. HF set theory can be seen as an abstract universe of computable objects, with many potential applications. One possibility is programming language semantics: using hf as the type of values offers open-ended possibilities, including integer, rational and floating point numbers, ASCII characters, and data structures.

Acknowledgements. Christian Urban and Tobias Nipkow offered advice, and suggested Brzozowski’s minimisation algorithm as an example. The referees made a variety of useful comments.
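As an informal companion to the verified statement about Brzozowski’s algorithm, the construction can be sketched in executable form. The Python below is not part of the Isabelle development and does not use HF sets. It assumes a machine is a tuple (alphabet, initial states, final states, transition dictionary), and the names reverse, accessible_powerset, apr and brzozowski are hypothetical, chosen only to mirror the APR operation (Accessible, Power, Reverse) described in the text.

```python
from collections import deque
from itertools import product

# Assumed representation (illustrative only): a machine is a tuple
# (alphabet, init, final, delta), where delta maps (state, symbol) to a
# set of successor states. A DFA is the special case of singleton sets.

def reverse(m):
    """Swap initial and final states and flip every transition."""
    alphabet, init, final, delta = m
    rdelta = {}
    for (q, a), targets in delta.items():
        for r in targets:
            rdelta.setdefault((r, a), set()).add(q)
    return alphabet, set(final), set(init), rdelta

def accessible_powerset(m):
    """Subset construction that builds only the accessible states."""
    alphabet, init, final, delta = m
    start = frozenset(init)
    states, queue, ddelta = {start}, deque([start]), {}
    while queue:
        s = queue.popleft()
        for a in alphabet:
            t = frozenset(r for q in s for r in delta.get((q, a), ()))
            ddelta[(s, a)] = {t}
            if t not in states:
                states.add(t)
                queue.append(t)
    dfinal = {s for s in states if s & set(final)}
    return alphabet, {start}, dfinal, ddelta

def apr(m):
    """Accessible part of the powerset automaton of the reversal."""
    return accessible_powerset(reverse(m))

def brzozowski(m):
    """Apply APR twice, as in Brzozowski's algorithm."""
    return apr(apr(m))

def num_states(m):
    _alphabet, init, _final, delta = m
    return len({q for (q, _a) in delta} | set(init))

def accepts(m, word):
    _alphabet, init, final, delta = m
    current = set(init)
    for a in word:
        current = {r for q in current for r in delta.get((q, a), ())}
    return bool(current & set(final))

def same_language(m1, m2, max_len):
    """Compare two machines on all words up to max_len (a finite check only)."""
    return all(accepts(m1, w) == accepts(m2, w)
               for n in range(max_len + 1)
               for w in product(m1[0], repeat=n))

# A three-state DFA for "words over {a, b} ending in a";
# states 0 and 2 are equivalent, so it is not minimal.
example = ("ab", {0}, {1}, {
    (0, "a"): {1}, (0, "b"): {2},
    (1, "a"): {1}, (1, "b"): {2},
    (2, "a"): {1}, (2, "b"): {0},
})
minimal = brzozowski(example)
```

Here num_states(minimal) can be compared with num_states(example), and same_language(example, minimal, 4) gives a bounded sanity check. Unlike the theorem above, such a check tests only finitely many words and proves nothing about minimality in general.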

References

1. C. Ballarin. Locales: A module system for mathematical theories. Journal of Automated Reasoning, 52(2):123–153, 2014.
2. T. Braibant and D. Pous. Deciding Kleene algebras in Coq. Logical Methods in Computer Science, 8(1), 2012.
3. J. Champarnaud, A. Khorsi, and T. Paranthoën. Split and join for minimizing: Brzozowski’s algorithm. In M. Balík and M. Simánek, editors, The Prague Stringology Conference, pages 96–104. Department of Computer Science and Engineering, Czech Technical University, 2002.
4. R. L. Constable, P. B. Jackson, P. Naumov, and J. C. Uribe. Constructively formalizing automata theory. In G. D. Plotkin, C. Stirling, and M. Tofte, editors, Proof, Language, and Interaction, pages 213–238. MIT Press, 2000.
5. C. Doczkal, J.-O. Kaiser, and G. Smolka. A constructive theory of regular languages in Coq. In G. Gonthier and M. Norrish, editors, Certified Programs and Proofs, LNCS 8307, pages 82–97. Springer, 2013.
6. J. E. Hopcroft and J. D. Ullman. Formal Languages and Their Relation to Automata. Addison-Wesley, 1969.
7. D. Kozen. Automata and computability. Springer, New York, 1997.
8. A. Krauss and T. Nipkow. Proof pearl: Regular expression equivalence and relation algebra. J. Autom. Reasoning, 49(1):95–106, 2012.
9. T. Nipkow. Verified lexical analysis. In J. Grundy and M. Newey, editors, Theorem Proving in Higher Order Logics: TPHOLs ’98, LNCS 1479, pages 1–15. Springer, 1998. Invited lecture.
10. T. Nipkow and D. Traytel. Unified decision procedures for regular expression equivalence. In G. Klein and R. Gamboa, editors, Interactive Theorem Proving, 5th International Conference, ITP 2014, LNCS 8558, pages 450–466. Springer, 2014.
11. L. C. Paulson. Defining functions on equivalence classes. ACM Transactions on Computational Logic, 7(4):658–675, 2006.
12. L. C. Paulson. Finite automata in hereditarily finite set theory. Archive of Formal Proofs, Feb. 2015. http://afp.sf.net/entries/Finite_Automata_HF.shtml, Formal proof development.
13. L. C. Paulson. A mechanised proof of Gödel’s incompleteness theorems using Nominal Isabelle. Journal of Automated Reasoning, 2015. In press. Available online at http://link.springer.com/article/10.1007%2Fs10817-015-9322-8.
14. S. Świerczkowski. Finite sets and Gödel’s incompleteness theorems. Dissertationes Mathematicae, 422:1–58, 2003. http://journals.impan.gov.pl/dm/Inf/422-0-1.html.
15. C. Wu, X. Zhang, and C. Urban. A formalisation of the Myhill-Nerode theorem based on regular expressions.