Better Automata through Process Algebra
RANCE CLEAVELAND
Department of Computer Science, University of Maryland, College Park MD 20742 USA
e-mail address: [email protected]
Abstract.
This paper shows how the use of Structural Operational Semantics (SOS) in the style popularized by the process-algebra community can lead to a more succinct and useful construction for building finite automata from regular expressions. Such constructions have been known for decades, and form the basis for the proofs of one direction of Kleene's Theorem. The purpose of the new construction is, on the one hand, to show students how small automata can be constructed, without the need for empty transitions, and on the other hand to show how the construction method admits closure proofs of regular languages with respect to other operators as well. These results, while not theoretically surprising, point to an additional influence of process-algebraic research: in addition to providing fundamental insights into the nature of concurrent computation, it also sheds new light on old, well-known constructions in automata theory.

1. Introduction
It is an honor to write this paper in celebration of Jos Baeten on the occasion of the publication of his Festschrift. I recall first becoming aware of Jos late in my PhD studies at Cornell University. Early in my doctoral career I had become independently interested in process algebra, primarily through Robin Milner's original monograph, A Calculus of Communicating Systems [Mil80], and indeed wound up writing my dissertation on the topic. I was working largely on my own; apart from very stimulating interactions with Prakash Panangaden, who was at Cornell at the time, there were no researchers in the area at Cornell. It was in this milieu that I stumbled across the seminal papers by Jos' colleagues, Jan Bergstra and Jan Willem Klop, describing the Algebra of Communicating Processes [BK84, BK85]. I was impressed with their classically algebraic approach, and their semantic accounts based on graph constructions. This, together with Milner's focus on operational semantics and the Communicating Sequential Processes community's focus on denotational semantics [BHR84], finally enabled me to truly understand the deep and satisfying links between operational, denotational and axiomatic approaches not only to process algebra, but to program semantics in general.

While Jos was not a co-author of the two papers just cited, he was an early contributor to the process-algebraic field and has remained a prolific researcher in both theoretical and
Key words and phrases:
Process algebra; finite automata; regular expressions; operational semantics.
Research supported by US Office of Naval Research Grant N000141712622.
Logical Methods in Computer Science, DOI:10.2168/LMCS-??? ©
R. Cleaveland, Creative Commons

applied aspects of the discipline. I have followed his career, and admired his interest in both foundational theory and practical applications of process theory, since completing my PhD in 1987. It is this broader view on the impact of process algebra that is the motivation for this note. Indeed, I will not focus so much on new theoretical results, satisfying though they can be. Rather, I want to recount a story about my usage of process-algebra-inspired techniques to redevelop part of an undergraduate course on automata theory that I taught for a number of years. Specifically, I will discuss how I have used the Structural Operational Semantics (SOS) techniques used extensively in process algebra to present what I have found to be more satisfying ways than those typically covered in textbooks to construct finite automata from regular expressions. Such constructions constitute a proof of one half of Kleene's Theorem [Kle56], which asserts a correspondence between regular languages and those accepted by finite automata.

In the rest of this paper I present the construction and contrast it with the constructions found in classical automata-theory textbooks such as [HMU06], explaining why I find the work presented here preferable from a pedagogical point of view. I also briefly situate the work in the setting of an efficient technique [BS86] used in practice for converting regular expressions to finite automata. The message I hope to convey is that in addition to contributing foundational understanding to notions of concurrent computation, process algebra can also cast new light on well-understood automaton constructions, and that pioneers in process algebra, such as Jos Baeten, are doubly deserving of the accolades they receive from the research community.
2. Alphabets, Languages, Regular Expressions and Automata
This section reviews the definitions and notation used later in this note for formal languages, regular expressions and finite automata. In the interest of succinctness the definitions depart slightly from those found in automata-theory textbooks, although notationally I try to follow the conventions used in those books.
2.1. Alphabets and Languages.
At their most foundational level digital computers are devices for computing with symbols. Alphabets and languages formalize this intuition mathematically.
Definition 2.1 (Alphabet, word).
(1) An alphabet is a finite non-empty set Σ of symbols.
(2) A word w over alphabet Σ is a finite sequence a1 . . . ak of elements from Σ. We say that k is the length of w in this case. If k = 0 we say w is empty; we write ε for the (unique) empty word over Σ. Note that every a ∈ Σ is also a (length-one) word over Σ. We write Σ∗ for the set of all words over Σ.
(3) If w1 = a1 . . . ak and w2 = b1 . . . bℓ are words over Σ then the concatenation, w1 · w2, of w1 and w2 is the word a1 . . . ak b1 . . . bℓ. Note that w · ε = ε · w = w for any word w. We often omit · and write w1 w2 for the concatenation of w1 and w2.
(4) A language L over alphabet Σ is a subset of Σ∗. The set of all languages over Σ is the set of all subsets of Σ∗, and is written 2^Σ∗ following standard mathematical conventions.
Since languages over Σ are sets, general set-theoretic operations, including ∪ (union), ∩ (intersection) and − (set difference), may be applied to them. Other, language-specific operations may also be defined.

Definition 2.2 (Language concatenation, Kleene closure). Let Σ be an alphabet.
(1) Let L1, L2 ⊆ Σ∗ be languages over Σ. Then the concatenation, L1 · L2, of L1 and L2 is defined as follows.
    L1 · L2 = { w1 · w2 | w1 ∈ L1 and w2 ∈ L2 }
(2) Let L ⊆ Σ∗ be a language over Σ. Then the Kleene closure, L∗, of L is defined inductively as follows.
    • ε ∈ L∗
    • If w1 ∈ L and w2 ∈ L∗ then w1 · w2 ∈ L∗.

2.2. Regular Expressions.
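The two operations in Definition 2.2 can be made concrete for finite language fragments; the following is a minimal Python sketch (the function names and the length bound on the closure are my own devices, since L∗ is infinite whenever L contains a non-empty word):

```python
def concat(L1, L2):
    """Language concatenation of Definition 2.2(1), for languages given as finite sets."""
    return {w1 + w2 for w1 in L1 for w2 in L2}

def kleene_up_to(L, n):
    """Words of the Kleene closure L* of length at most n, built by the inductive
    rule: eps is in L*, and if w1 in L and w2 in L* then w1 . w2 is in L*."""
    words = {""}
    frontier = {""}
    while frontier:
        frontier = {w1 + w2 for w1 in L if w1 for w2 in frontier
                    if len(w1 + w2) <= n} - words
        words |= frontier
    return words
```

For example, concat({"a", "b"}, {"c"}) is {"ac", "bc"}, and kleene_up_to({"ab"}, 4) is {"", "ab", "abab"}.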
Regular expressions provide a notation for defining languages.
Definition 2.3 (Regular expression). Let Σ be an alphabet. Then the set, R(Σ), of regular expressions over Σ is defined inductively as follows.
• ∅ ∈ R(Σ).
• ε ∈ R(Σ).
• If a ∈ Σ then a ∈ R(Σ).
• If r1 ∈ R(Σ) and r2 ∈ R(Σ) then r1 + r2 ∈ R(Σ) and r1 · r2 ∈ R(Σ).
• If r ∈ R(Σ) then r∗ ∈ R(Σ).

It should be noted that R(Σ) is a set of expressions; the occurrences of ∅, ε, +, · and ∗ are symbols that do not innately possess any meaning, but must instead be given a semantics. This is done by interpreting regular expressions mathematically as languages. The formal definition takes the form of a function, L ∈ R(Σ) → 2^Σ∗, assigning a language L(r) ⊆ Σ∗ to regular expression r.

Definition 2.4 (Language of a regular expression, regular language). Let Σ be an alphabet, and r ∈ R(Σ) a regular expression over Σ. Then the language, L(r) ⊆ Σ∗, associated with r is defined inductively as follows.
    L(r) = ∅               if r = ∅
           { ε }           if r = ε
           { a }           if r = a and a ∈ Σ
           L(r1) ∪ L(r2)   if r = r1 + r2
           L(r1) · L(r2)   if r = r1 · r2
           (L(r′))∗        if r = (r′)∗
A language L ⊆ Σ∗ is regular if and only if there is a regular expression r ∈ R(Σ) such that L(r) = L.

Textbooks typically define L∗ differently, by first introducing L^i for i ≥ 0 and then defining L∗ = ∪_{i=0}^∞ L^i.
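Definition 2.4 can be animated for bounded word lengths; a hedged sketch follows (the tuple encoding of expressions and the length cutoff n are my own devices, needed because L(r) is infinite in general):

```python
def lang(r, n):
    """Words of L(r) of length <= n (Definition 2.4), for r encoded as nested
    tuples: ('empty',), ('eps',), ('sym', a), ('+', r1, r2), ('.', r1, r2), ('*', r1)."""
    tag = r[0]
    if tag == 'empty':
        return set()
    if tag == 'eps':
        return {''}
    if tag == 'sym':
        return {r[1]} if n >= 1 else set()
    if tag == '+':
        return lang(r[1], n) | lang(r[2], n)
    if tag == '.':
        return {u + v for u in lang(r[1], n)
                      for v in lang(r[2], n) if len(u + v) <= n}
    # tag == '*': saturate with non-empty words of the body, as in Definition 2.2(2)
    words, frontier = {''}, {''}
    base = lang(r[1], n) - {''}
    while frontier:
        frontier = {u + v for u in base for v in frontier
                    if len(u + v) <= n} - words
        words |= frontier
    return words
```

For the expression (abb + a)∗ used later in this note, lang with n = 3 yields {ε, a, aa, aaa, abb}.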
2.3. Finite Automata.
Traditional accounts of finite automata typically introduce three variations of the notion: deterministic (DFA), nondeterministic (NFA), and nondeterministic with ε-transitions (NFA-ε). I will do the same, although I will do so in a somewhat different order than is typical.

Definition 2.5 (Nondeterministic Finite Automaton (NFA)). A nondeterministic finite automaton (NFA) is a tuple (Q, Σ, δ, qI, F), where:
• Q is a finite non-empty set of states;
• Σ is an alphabet;
• δ ⊆ Q × Σ × Q is the transition relation;
• qI ∈ Q is the initial state; and
• F ⊆ Q is the set of accepting, or final, states.

This definition of NFA differs slightly from e.g. [HMU06] in that δ is given as a relation rather than a function in Q × Σ → 2^Q. It also defines the form of a NFA but not the sense in which it is indeed a machine for processing words in a language. The next definition does this by associating a language L(M) with a given NFA M = (Q, Σ, δ, qI, F).

Definition 2.6 (Language of a NFA). Let M = (Q, Σ, δ, qI, F) be a NFA.
(1) Let q ∈ Q be a state of M and w ∈ Σ∗ be a word over Σ. Then M accepts w from q if and only if one of the following holds.
• w = ε and q ∈ F; or
• w = aw′ for some a ∈ Σ and w′ ∈ Σ∗, and there exists (q, a, q′) ∈ δ such that M accepts w′ from q′.
(2) The language, L(M), accepted by M is defined as follows.
    L(M) = { w ∈ Σ∗ | M accepts w from qI }

Deterministic Finite Automata (DFAs) constitute a subclass of NFAs whose transition relation is deterministic, in a precisely defined sense.
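The acceptance clauses of Definition 2.6 transcribe directly into a recursive check; a sketch follows (the example machine used below, which accepts words ending in ab, is my own):

```python
def accepts(delta, F, q, w):
    """Does the NFA with transition relation delta (a set of (q, a, q') triples)
    and accepting states F accept word w from state q?  (Definition 2.6)"""
    if w == "":
        return q in F                      # w = eps and q in F
    a, w_rest = w[0], w[1:]                # w = a w'
    return any(accepts(delta, F, q2, w_rest)
               for (q1, s, q2) in delta if q1 == q and s == a)
```

With δ = {(0,'a',0), (0,'b',0), (0,'a',1), (1,'b',2)} and F = {2}, the call accepts(δ, F, 0, "aab") returns True, while accepts(δ, F, 0, "aba") returns False.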
Definition 2.7 (Deterministic Finite Automaton (DFA)). NFA M = (Q, Σ, δ, qI, F) is a deterministic finite automaton (DFA) if and only if δ satisfies the following: for every q ∈ Q and a ∈ Σ, there exists exactly one q′ such that (q, a, q′) ∈ δ.

Since DFAs are NFAs, the definition of L in Definition 2.6 is directly applicable to them as well. NFAs with ε-transitions are now defined as follows.

Definition 2.8 (NFAs with ε-Transitions). A nondeterministic automaton with ε-transitions (NFA-ε) is a tuple (Q, Σ, δ, qI, F), where:
• Q is a nonempty finite set of states;
• Σ is an alphabet, with ε ∉ Σ;
• δ ⊆ Q × (Σ ∪ {ε}) × Q is the transition relation;
• qI ∈ Q is the initial state; and
• F ⊆ Q is the set of accepting, or final, states.

An NFA-ε is like a NFA except that some transitions can be labeled with the empty string ε rather than a symbol from Σ. The intuition is that a transition of form (q, ε, q′) can occur without consuming any symbol as an input. Formalizing this intuition, and defining L(M) for NFA-ε, may be done as follows.

Definition 2.9 (Language of a NFA-ε). Let M = (Q, Σ, δ, qI, F) be a NFA-ε.
(1) Let q ∈ Q and w ∈ Σ∗. Then M accepts w from q if and only if one of the following holds.
• w = ε and q ∈ F; or
• w = aw′ for some a ∈ Σ and w′ ∈ Σ∗ and there exists q′ ∈ Q such that (q, a, q′) ∈ δ and M accepts w′ from q′; or
• there exists q′ ∈ Q such that (q, ε, q′) ∈ δ and M accepts w from q′.
(2) The language, L(M), accepted by M is defined as follows.
    L(M) = { w ∈ Σ∗ | M accepts w from qI }

Defining the language of a NFA-ε requires redefining the notion of a machine accepting a string from state q as given in the definition of the language of a NFA.
This redefinition reflects the essential difference between ε-transitions and those labeled by alphabet symbols.

The three types of automata have differences in form, but equivalent expressive power. It should first be noted that, just as every DFA is already a NFA, every NFA is also a NFA-ε, namely, a NFA-ε with no ε-transitions. Thus, every language accepted by some DFA is also accepted by some NFA, and every language accepted by some NFA is accepted by some NFA-ε. The next theorem establishes the converses of these implications.

Theorem 2.10 (Equivalence of DFAs, NFAs and NFA-εs).
(1) Let M be a NFA. Then there is a DFA D(M) such that L(D(M)) = L(M).
(2) Let M be a NFA-ε. Then there is a NFA N(M) such that L(N(M)) = L(M).

Proof. The proof of Case (1) involves the well-known subset construction, whereby each subset of states in M is associated with a single state in D(M). The proof of Case (2) typically relies on defining the ε-closure of a set of states, namely, the set of states reachable from the given set via a sequence of zero or more ε-transitions. This notion is used to define the transition relation of N(M) as well as its set of accepting states.

3. Kleene's Theorem
Given the definitions in the previous section it is now possible to state Kleene's Theorem succinctly.
Theorem 3.1 (Kleene's Theorem). Let Σ be an alphabet. Then L ⊆ Σ∗ is regular if and only if there is a DFA M such that L(M) = L.

The proof of this theorem is usually split into two pieces. The first involves showing that for any regular expression r, there is a finite automaton M (DFA, NFA or NFA-ε) such that L(M) = L(r). Theorem 2.10 then ensures that the resulting finite automaton, if it is not already a DFA, can be converted into one in a language-preserving manner. The second shows how to convert a DFA M into a regular expression r in such a way that L(r) = L(M); there are several algorithms for this in the literature, including the classic dynamic-programming-based method of Kleene [Kle56] and equation-solving methods that rely on Arden's Lemma [Ard61].

From a practical standpoint, the conversion of regular expressions to finite automata is the more important, since regular expressions are textual and are consequently used as the basis for string search and processing. For this reason, I believe that teaching this construction is especially key in automata-theory classes, and this is where my complaint with the approaches in traditional automata-theory texts originates.
To understand the basis for my dissatisfaction, let us review the construction presented in [HMU06], which explains how to convert regular expression r into NFA-ε Mr in such a way that L(r) = L(Mr). The method is based on the construction due to Ken Thompson [Tho68] and produces NFA-ε Mr with the following properties.
• The initial state qI has no incoming transitions: that is, there exists no (q, α, qI) ∈ δ.
• There is a single accepting state qF, and qF has no outgoing transitions: that is, F = {qF}, and there exists no (qF, α, q′) ∈ δ.
The approach proceeds inductively on the structure of r. For example, if r = (r′)∗, then assume that Mr′ = (Q, Σ, δ, qI, {qF}) meeting the above constraints has been constructed. Then Mr is built as follows. First, let q′I ∉ Q and q′F ∉ Q be new states. Then Mr = (Q ∪ {q′I, q′F}, Σ, δ′, q′I, {q′F}), where
    δ′ = δ ∪ { (q′I, ε, qI), (q′I, ε, q′F), (qF, ε, qI), (qF, ε, q′F) }.
It can be shown that Mr satisfies the requisite properties and that L(Mr) = (L(r′))∗.

Mathematically, the construction of Mr is wholly satisfactory: it has the required properties and can be defined relatively easily, albeit at the cost of introducing new states and transitions. The proof of correctness is perhaps somewhat complicated, owing to the definition of L(M) and the subtlety of ε-transitions, but it does acquaint students with definitions via structural induction on regular expressions.

My concern with the construction, however, is several-fold. On the one hand, it does require the introduction of the notion of NFA-ε, which is indeed more complex than that of NFA. In particular, the definition of acceptance requires allowing transitions that consume no symbol in the input word.
On the other hand, the accretion of new states at each stage in the construction makes it difficult to test students on their understanding of the construction in an exam setting. Specifically, even for relatively small regular expressions the literal application of the construction yields automata with too many states and transitions to be doable during the typical one-hour midterm exam in which US students would be tested on the material. Finally, the construction bears no resemblance to algorithms used in practice for constructing finite automata from regular expressions. In particular, routines such as the Berry-Sethi procedure [BS86] construct DFAs directly from regular expressions, completely avoiding the need for NFA-εs, or indeed NFAs, altogether.

The Berry-Sethi procedure is subtle and elegant, and relies on concepts, such as Brzozowski derivatives [Brz64], that I would view as too specialized for an undergraduate course on automata theory. Consequently, I would not be in favor of covering them in an undergraduate classroom setting. Instead, in the next section I give a technique, based on operational semantics in process algebra, for constructing NFAs from regular expressions. The resulting NFAs are small enough for students to construct during exams, and the construction has other properties, including the capacity for introducing other operations that preserve regularity, that are pedagogically useful.

4. NFAs via Structural Operational Semantics
This section describes an approach based on
Structural Operational Semantics (SOS) [Plo81, Plo04] for constructing NFAs from regular expressions. Specifically, I will define a (small-step) operational semantics for regular expressions on the basis of the structure of regular expressions, and use the semantics to construct the requisite NFAs. The construction requires no ε-transitions and yields automata with at most one more state than the size of the regular expression from which they are derived.

Following the conventions in the other parts of this paper I give the SOS rules using notation typically found in automata-theory texts. In particular, the SOS specification is given in natural language, as a collection of if-then statements, and not via inference rules. I use this approach in the classroom to avoid having to introduce notations for inference rules. In the appendix I give the more traditional SOS presentation.

4.1. An Operational Semantics for Regular Expressions.
In what follows fix alphabet Σ. The basis for the operational semantics of regular expressions consists of a relation, −→ ⊆ R(Σ) × Σ × R(Σ), and a predicate √ ⊆ R(Σ). In what follows I will write r −a→ r′ and r√ in lieu of (r, a, r′) ∈ −→ and r ∈ √. The intuitions are as follows.
(1) r√ is intended to hold if and only if ε ∈ L(r). This is used in defining accepting states.
(2) r −a→ r′ is intended to reflect the following about L(r): one way to build a word in L(r) is to start with a ∈ Σ and then finish it with a word from L(r′).
Using these relations, I then show how to build a NFA from r whose states are regular expressions, whose transitions are given by −→, and whose final states are defined using √.

Defining √ and −→. We now define √.

Definition 4.1 (Definition of √). Predicate r√ is defined inductively on the structure of r ∈ R(Σ) as follows.
• If r = ε then r√.
• If r = (r′)∗ for some r′ ∈ R(Σ) then r√.
• If r = r1 + r2 for some r1, r2 ∈ R(Σ), and r1√, then r√.
• If r = r1 + r2 for some r1, r2 ∈ R(Σ), and r2√, then r√.
• If r = r1 · r2 for some r1, r2 ∈ R(Σ), and r1√ and r2√, then r√.

From the definition, one can see that it is not the case that ∅√ or a√, for any a ∈ Σ, while both ε√ and r∗√ always hold. This accords with the definition of L(r): ε ∉ L(∅) = ∅ and ε ∉ L(a) = {a}, while ε ∈ L(ε) = {ε} and ε ∈ L∗ for any language L ⊆ Σ∗, and in particular for L = L(r) for regular expression r. The other cases in the definition reflect the fact that ε ∈ L(r1 + r2) can only hold if ε ∈ L(r1) or ε ∈ L(r2), since + is interpreted as set union, and that ε ∈ L(r1 · r2) can only be true if ε ∈ L(r1) and ε ∈ L(r2), since regular-expression operator · is interpreted as language concatenation. We have the following examples.
• (ε · a∗)√, since ε√ and a∗√.
• ¬(a + b)√, since neither a√ nor b√.
• (01 + (1 + 01)∗)√, since (1 + 01)∗√.
• ¬(01(1 + 01)∗)√, since ¬(01)√.

We also use structural induction to define −→.

Definition 4.2 (Definition of −→). Relation r −a→ r′, where r, r′ ∈ R(Σ) and a ∈ Σ, is defined inductively on r.
• If r = a and a ∈ Σ then r −a→ ε.
• If r = r1 + r2 and r1 −a→ r′ then r −a→ r′.
• If r = r1 + r2 and r2 −a→ r′ then r −a→ r′.
• If r = r1 · r2 and r1 −a→ r′ then r −a→ r′ · r2.
• If r = r1 · r2, r1√ and r2 −a→ r′ then r −a→ r′.
• If r = (r′)∗ and r′ −a→ r′′ then r −a→ r′′ · (r′)∗.

The definition of this relation is somewhat complex, but the idea that it is trying to capture is relatively simple: r −a→ r′ if one can build words in L(r) by taking the a labeling −→ and appending a word from L(r′). So we have the rule a −a→ ε for a ∈ Σ, while the rules for + follow from the fact that L(r1 + r2) = L(r1) ∪ L(r2). The cases for r1 · r2 in essence state that aw ∈ L(r1 · r2) can hold either if there is a way of splitting w into w1 and w2 such that aw1 is in the language of r1 and w2 is in the language of r2, or if ε is in the language of r1 and aw is in the language of r2. Finally, the rule for (r′)∗ essentially permits "looping". As examples, we have the following.
• a + b −a→ ε, by the rules for a and +.
• (abb + a)∗ −a→ εbb(abb + a)∗, by the rules for a, ·, + and ∗.

In this latter example, note that applying the definition literally requires the inclusion of the ε in εbb(abb + a)∗. This is because the case for a says that a −a→ ε, meaning that abb −a→ εbb, etc. However, when there are leading instances of ε like this, I will sometimes leave them out, and write abb −a→ bb rather than abb −a→ εbb.

The following lemmas about √ and −→ formally establish the intuitive properties that they should have.

Lemma 4.3.
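Definitions 4.1 and 4.2 are directly executable; the following Python sketch transcribes them (the tuple encoding of expressions is my own; nullable plays the role of √):

```python
def nullable(r):
    """The predicate of Definition 4.1: does eps belong to L(r)?
    Expressions are tuples: ('empty',), ('eps',), ('sym', a),
    ('+', r1, r2), ('.', r1, r2), ('*', r1)."""
    tag = r[0]
    if tag in ('empty', 'sym'):
        return False
    if tag in ('eps', '*'):
        return True
    if tag == '+':
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])          # tag == '.'

def step(r):
    """The relation r -a-> r' of Definition 4.2, returned as a set of (a, r') pairs."""
    tag = r[0]
    if tag in ('empty', 'eps'):
        return set()
    if tag == 'sym':
        return {(r[1], ('eps',))}                     # a -a-> eps
    if tag == '+':
        return step(r[1]) | step(r[2])                # both + rules
    if tag == '.':
        moves = {(a, ('.', t, r[2])) for (a, t) in step(r[1])}
        if nullable(r[1]):
            moves |= step(r[2])
        return moves
    return {(a, ('.', t, r)) for (a, t) in step(r[1])}   # tag == '*': "looping"
```

On the examples above, nullable holds of ε · a∗ but not of a + b, while step applied to a + b yields the two transitions labeled a and b to ε.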
Let r ∈ R(Σ) be a regular expression. Then r√ if and only if ε ∈ L(r).

Proof. The proof proceeds by structural induction on r. Most cases are left to the reader; we only consider the r = r1 · r2 case here. The induction hypothesis states that r1√ if and only if ε ∈ L(r1), and that r2√ if and only if ε ∈ L(r2). One reasons as follows.
    r√  iff  r1√ and r2√                     Definition of √
        iff  ε ∈ L(r1) and ε ∈ L(r2)         Induction hypothesis
        iff  ε ∈ (L(r1)) · (L(r2))           Property of concatenation
        iff  ε ∈ L(r1 · r2)                  Definition of L(r1 · r2)
        iff  ε ∈ L(r)                        r = r1 · r2

Lemma 4.4.
Let r ∈ R(Σ), a ∈ Σ, and w ∈ Σ∗. Then aw ∈ L(r) if and only if there is an r′ ∈ R(Σ) such that r −a→ r′ and w ∈ L(r′).

Proof. The proof proceeds by structural induction on r. We only consider the case r = (r′)∗ in detail; the others are left to the reader. The induction hypothesis asserts that for all a and w′, aw′ ∈ L(r′) if and only if there is an r′′ such that r′ −a→ r′′ and w′ ∈ L(r′′). We reason as follows.
    aw ∈ L(r)
    iff aw ∈ L((r′)∗)                                       r = (r′)∗
    iff aw ∈ (L(r′))∗                                       Definition of L((r′)∗)
    iff aw = w1 · w2 for some w1 ∈ L(r′), w2 ∈ (L(r′))∗     Definition of Kleene closure
    iff w1 = a · w′ for some w′                             Property of Kleene closure
    iff r′ −a→ r′′ for some r′′ with w′ ∈ L(r′′)            Induction hypothesis
    iff w′ · w2 ∈ L(r′′) · L((r′)∗)                         Definition of concatenation
    iff w′ · w2 ∈ L(r′′ · (r′)∗)                            Definition of L(r′′ · (r′)∗)
    iff r −a→ r′′ · (r′)∗ and w ∈ L(r′′ · (r′)∗)            Definition of −→; w = w′ · w2

(The convention of suppressing leading εs can be formalized by introducing a special case in the definition of −→ for a · r and distinguishing the current two cases for r1 · r2 to apply only when r1 ∉ Σ.)

Appendix A contains definitions of √ and −→ in the more usual inference-rule style used in SOS specifications.

4.2. Building Automata using √ and −→. That √ and −→ may be used to build NFAs derives from how they may be used to determine whether a string is in the language of a regular expression. Consider the following sequence of transitions starting from the regular expression (abb + a)∗.
    (abb + a)∗ −a→ bb(abb + a)∗ −b→ b(abb + a)∗ −b→ (abb + a)∗ −a→ (abb + a)∗
Using Lemma 4.4 four times, we can conclude that if w ∈ L((abb + a)∗), then abba · w ∈ L((abb + a)∗) also. In addition, since (abb + a)∗√, it follows from Lemma 4.3 that ε ∈ L((abb + a)∗).
Since abba · ε = abba, it follows that abba ∈ L((abb + a)∗).

More generally, if there is a sequence of transitions r −a1→ r1 · · · −an→ rn and rn√, then it follows that a1 . . . an ∈ L(r), and vice versa. This observation suggests the following strategy for building a NFA from a regular expression r.
(1) Let the states be all possible regular expressions that can be reached by some sequence of transitions from r.
(2) Take r to be the start state.
(3) Let the transitions be given by −→.
(4) Let the accepting states be those regular expressions r′ reachable from r for which r′√ holds.
Of course, this construction is only valid if the set of all possible regular expressions mentioned in Step (1) is finite, since NFAs are required to have a finite number of states. In fact, a stronger result can be proved. First, recall the definition of the size, |r|, of regular expression r.

Definition 4.5 (Size of a regular expression). The size, |r|, of r ∈ R(Σ) is defined inductively as follows.
    |r| = 1                 if r = ε, r = ∅, or r = a for some a ∈ Σ
    |r| = |r′| + 1          if r = (r′)∗
    |r| = |r1| + |r2| + 1   if r = r1 + r2 or r = r1 · r2
Intuitively, |r| counts the number of regular-expression operators in r. The reachability set of regular expression r can now be defined in the usual manner.

Definition 4.6.
Let r ∈ R(Σ) be a regular expression. Then the set RS(r) ⊆ R(Σ) of regular expressions reachable from r is defined recursively as follows.
• r ∈ RS(r).
• If r1 ∈ RS(r) and r1 −a→ r2 for some a ∈ Σ, then r2 ∈ RS(r).
As an example, note that |(abb + a)∗| = 8 and that
    RS((abb + a)∗) = { (abb + a)∗, εbb(abb + a)∗, εb(abb + a)∗, ε(abb + a)∗ }.
(In this case I have not applied my heuristic of suppressing leading ε expressions.) The following can now be proved.

Theorem 4.7.
Let r ∈ R(Σ) be a regular expression. Then |RS(r)| ≤ |r| + 1.

Proof. The proof proceeds by structural induction on r. There are six cases to consider.
r = ∅: In this case RS(r) = {∅}, and |RS(r)| = 1 = |r| < |r| + 1.
r = ε: In this case RS(r) = {ε}, and |RS(r)| = 1 = |r| < |r| + 1.
r = a for some a ∈ Σ: In this case RS(r) = {a, ε}, and |RS(r)| = 2 = |r| + 1.
r = r1 + r2: In this case, RS(r) ⊆ RS(r1) ∪ RS(r2), and the induction hypothesis guarantees that |RS(r1)| ≤ |r1| + 1 and |RS(r2)| ≤ |r2| + 1. It then follows that
    |RS(r)| ≤ |RS(r1)| + |RS(r2)| ≤ |r1| + |r2| + 2 = |r| + 1.
r = r1 · r2: In this case it can be shown that RS(r) ⊆ { r′ · r2 | r′ ∈ RS(r1) } ∪ RS(r2). Since |{ r′ · r2 | r′ ∈ RS(r1) }| = |RS(r1)|, similar reasoning as in the + case applies.
r = (r′)∗: In this case we have that RS(r) ⊆ {r} ∪ { r′′ · r | r′′ ∈ RS(r′) }. Thus
    |RS(r)| ≤ |RS(r′)| + 1 ≤ |r′| + 2 = |r| + 1.

This result shows not only that the sketched NFA construction given above yields a finite number of states for given r, it in fact establishes that this set of states is no larger than |r| + 1. This highlights one of the main reasons I opted to introduce this construction in my classes: small regular expressions yield NFAs that are almost as small, and can be constructed manually in an exam setting. We can now formally define the construction of NFA Mr from regular expression r as follows.

Definition 4.8.
Let r ∈ R(Σ) be a regular expression. Then Mr = (Q, Σ, δ, qI, F) is the NFA defined as follows.
• Q = RS(r).
• qI = r.
• δ = { (r1, a, r2) | r1 −a→ r2 }.
• F = { r′ ∈ Q | r′√ }.
The next theorem establishes that r and Mr define the same languages.

Theorem 4.9.
Let r ∈ R(Σ) be a regular expression. Then L(r) = L(Mr).

Proof. Relies on the fact that Lemmas 4.3 and 4.4 guarantee that w = a1 . . . an ∈ L(r) if and only if there is a regular expression r′ such that r −a1→ · · · −an→ r′ and r′√.

    out(r) = ∅                                                 if r = ∅ or r = ε
    out(r) = { (a, ε) }                                        if r = a ∈ Σ
    out(r) = out(r1) ∪ out(r2)                                 if r = r1 + r2
    out(r) = { (a, r′ · r2) | (a, r′) ∈ out(r1) }
             ∪ { (a, r′) | (a, r′) ∈ out(r2) ∧ r1√ }           if r = r1 · r2
    out(r) = { (a, r′′ · (r′)∗) | (a, r′′) ∈ out(r′) }         if r = (r′)∗

Figure 1: Calculating the outgoing transitions of regular expressions.
4.3. Computing Mr. This section gives a routine for computing Mr. It intertwines the computation of the reachability set from regular expression r with the updating of the transition relation and set of accepting states. It relies on the computation of the so-called outgoing transitions of r; these are defined as follows.

Definition 4.10.
Let r ∈ R(Σ) be a regular expression. Then the set of outgoing transitions from r is defined as the set { (a, r′) | r −a→ r′ }.

The outgoing transitions from r consist of pairs (a, r′) that, when combined with r, constitute a valid transition r −a→ r′. Figure 1 defines a recursive function, out, for computing the outgoing transitions of r. The routine uses the structure of r and the definition of −→ to guide its computation. For regular expressions of the form ∅, ε and a ∈ Σ, the definition of −→ in Definition 4.2 immediately gives all the transitions. For regular expressions built using +, · and ∗, one must first recursively compute the outgoing transitions of the subexpressions of r and then combine the results appropriately, based on the cases given in Definition 4.2. The next lemma states that out(r) correctly computes the outgoing transitions of r.

Lemma 4.11.
Let r ∈ R(Σ) be a regular expression, and let out(r) be as defined in Figure 1. Then out(r) = { (a, r′) | r −a→ r′ }.

Proof. By structural induction on r. The details are left to the reader.

Algorithm 1 contains pseudo-code for computing Mr. It maintains four sets.
• Q, a set that will eventually contain the states of Mr.
• F, a set that will eventually contain the accepting states of Mr.
• δ, a set that will eventually contain the transition relation of Mr.
• W, the work set, a subset of Q containing states that have not yet had their outgoing transitions computed or acceptance status determined.
The procedure begins by adding r, its input parameter, to both Q and W. It then repeatedly removes a state from W, determines if it should be added to F, computes its outgoing transitions and updates δ appropriately, and finally adds the target states in the outgoing transition set to both Q and W if they are not yet in Q (meaning they have not yet been encountered in the construction of Mr). The algorithm terminates when W is empty. Figure 2 gives the NFA resulting from applying the procedure to (abb + a)∗. Figure 3, by way of contrast, shows the result of applying the routine in [HMU06] to produce a NFA-ε from the same regular expression.

Algorithm 1:
Algorithm for computing NFA Mr from regular expression r

Algorithm NFA(r)
Input: Regular expression r ∈ R(Σ)
Output: NFA Mr = (Q, Σ, δ, qI, F)
    Q := {r}                                       // State set
    qI := r                                        // Start state
    W := {r}                                       // Working set
    δ := ∅                                         // Transition relation
    F := ∅                                         // Accepting states
    while W ≠ ∅ do
        choose r′ ∈ W
        W := W − {r′}
        if r′√ then F := F ∪ {r′}                  // r′ is an accepting state
        T := out(r′)                               // Outgoing transitions of r′
        δ := δ ∪ { (r′, a, r′′) | (a, r′′) ∈ T }   // Update transition relation
        foreach (a, r′′) ∈ T do
            if r′′ ∉ Q then
                Q := Q ∪ {r′′}                     // r′′ is a new expression
                W := W ∪ {r′′}
        end
    end
    return Mr = (Q, Σ, δ, qI, F)

5. Discussion
The title of this note is "Better Automata through Process Algebra," and I want to revisit it in order to explain in what respects I regard the method presented here as producing "better automata." Earlier I identified the following motivations that prompted me to incorporate this approach in my classroom instruction.
• I wanted to produce NFAs rather than NFA-εs. In large part this was due to my desire not to cover the notion of NFA-ε. The only place this material is used in typical automata-theory textbooks is as a vehicle for converting regular expressions into finite automata. By giving a construction that avoids the use of ε-transitions, I could avoid covering NFA-εs and devote the newly freed lecture time to other topics. Of course, this is only possible if the NFA-based construction does not require more time to describe than the introduction of NFA-ε and the NFA-ε construction.
• I wanted the construction to be one that students could apply during an exam to generate finite automata from regular expressions. The classical construction found in [HMU06] and other books fails this test, in my opinion; while the inductive definitions are mathematically pleasing, they yield automata with too many states for students to be expected to apply them in a time-constrained setting.
Figure 2: NFA(r) for r = (abb + a)∗.

• Related to the preceding point, I wanted a technique that students could imagine being implemented and used in the numerous applications to which regular expressions are applied. In such a setting, fewer states is better than more states, all things considered.

This note has attempted to argue these points by giving a construction in Definition 4.8 for constructing NFAs directly from regular expressions. Theorem 4.7 establishes that the number of states in these NFAs is at most one larger than the size of the regular expression from which the NFAs are generated; this provides guidance in preparing exam questions, as the size of the NFAs students can be asked to generate is tightly bounded by the size of the regular expression given in the exam. Finally, Algorithm 1 gives a “close-to-code” account of the construction that hints at its implementability. Indeed, several years ago a couple of students to whom I presented this material independently implemented the algorithm.

Beyond the points mentioned above, I think this approach has two other points in its favor. The first is that it provides a basis for defining other operators over regular expressions and proving that the class of regular languages is closed with respect to these operations. The ingredients for introducing such a new operator and proving closure of regular languages with respect to it can be summarized as follows.
(1) Extend the definition of L(r) given in Definition 2.4 to give a language-theoretic semantics for the operator.
(2) Extend the definitions of √ and −→ in Definitions 4.1 and 4.2 to give a small-step operational semantics for the operator.
Figure 3: NFA-ε for (abb + a)∗.

(3) Extend the proofs of Lemmas 4.3 and 4.4 to establish connections between the language semantics and the operational semantics.
(4) Prove that expressions extended with the new operator yield finite sets of reachable expressions.

All of these steps involve adding new cases to the existing definitions and lemmas, and altering Theorem 4.7 in the case of the last point. Once these are done, Algorithm 1, with the definition of out given in Figure 1 suitably modified to cover the new operator, can be used as is as a basis for constructing NFAs from these extended classes of regular languages.
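By way of illustration, the sketch below carries out step (2) of this recipe for an interleaving (“shuffle”) operator on a small fragment of the syntax. It is not code from the paper: the tuple encoding and the names nullable, out are my own, and only the cases needed to show the new operator are included.

```python
# Sketch: adding an interleaving ("shuffle") operator to the acceptance
# predicate and the transition function.  Expressions are nested tuples:
# ('eps',) for the empty word, ('sym', a) for a single symbol, and
# ('shuffle', r1, r2) for the new operator.  The encoding is illustrative.

def nullable(r):
    """Acceptance predicate (written with a check mark in the paper)."""
    if r[0] == 'eps':
        return True
    if r[0] == 'sym':
        return False
    if r[0] == 'shuffle':
        # new case: an interleaving accepts the empty word
        # exactly when both operands do
        return nullable(r[1]) and nullable(r[2])
    raise ValueError(r[0])

def out(r):
    """Outgoing transitions {(a, r')} such that r -a-> r'."""
    if r[0] == 'eps':
        return set()
    if r[0] == 'sym':
        return {(r[1], ('eps',))}
    if r[0] == 'shuffle':
        # new cases: either operand may perform the next step,
        # leaving the other unchanged
        r1, r2 = r[1], r[2]
        return ({(a, ('shuffle', s, r2)) for (a, s) in out(r1)} |
                {(a, ('shuffle', r1, s)) for (a, s) in out(r2)})
    raise ValueError(r[0])
```

Because out still returns a finite set of (symbol, expression) pairs, the worklist construction of Algorithm 1 can be reused unchanged; what remains is step (4), checking that only finitely many expressions are reachable.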
I have used parts of this approach in the classroom to ask students to prove that synchronous product and interleaving operators preserve language regularity. Other operators, such as ones from process algebra, are also candidates for these kinds of questions.

The second feature of the approach in this paper that I believe recommends it is that the NFA construction is “on-the-fly”: the construction of an automaton from a regular expression does not require the a priori construction of automata from subexpressions, meaning that the actual production of the automaton can be intertwined with other operations, such as checking whether a word belongs to the regular expression’s language. One does not need to wait for the construction of the full automaton, in other words, before putting it to use.

Criticisms that I have heard of this approach center around two issues. The first is that the construction of NFA M_r from regular expression r does not use structural induction on r, unlike the classical constructions in e.g. [HMU06]. I do not have much patience with this complaint, as the concepts that M_r is built on, namely √ and −→, are defined inductively, and the results proven about them require substantial use of induction. The other complaint is that the notion of r −a→ r′ is “hard to understand.” It is indeed the case that equipping regular expressions with an operational semantics is far removed from the language-theoretic semantics typically given to these expressions. That said, I would argue that the small-step operational semantics considered here in fact exposes the essence of the relationship between regular expressions and finite automata: this semantics enables regular expressions to be executed, and in a way that can be captured via automata.

I close this section with a brief discussion of the Berry-Sethi algorithm [BS86], which is used in practice and produces deterministic finite automata.
This feature enables their technique to accommodate complementation, an operation with respect to which regular languages are closed but which fits uneasily with NFAs. From a pedagogical perspective, however, the algorithm suffers somewhat, as the number of states in a DFA can be exponentially larger than the size of the regular expression from which it is derived. A similar criticism can be made of other techniques that rely on Brzozowski derivatives [Brz64], which also produce DFAs. There are interesting connections between our operational semantics and these derivatives, but we exploit nondeterminacy to keep the sizes of the resulting finite automata small.

6. Conclusions and Directions for Future Work
In this note I have presented an alternative approach for converting regular expressions into finite automata. The method relies on defining an operational semantics for regular expressions, and as such draws inspiration from the work on process algebra undertaken by pioneers in that field, including Jos Baeten. In contrast with classical techniques, the construction here does not require transitions labeled by the empty word ε, and it yields automata whose state sets are proportional in size to the regular expressions they come from. The procedure can also be implemented in an on-the-fly manner, meaning that the production of the automaton can be intertwined with other analysis procedures as well.

Other algorithms studied in process algebra also have pedagogical promise, in my opinion. One method, the Kanellakis-Smolka algorithm for computing bisimulation equivalence [KS90], is a case in point. Partition-refinement algorithms for computing language equivalence of deterministic automata have been in existence for decades, but the details underpinning them are subtle and difficult to present in an undergraduate automata-theory class, where instructional time is at a premium. While not as efficient asymptotically as the best procedures, the simplicity of the Kanellakis-Smolka technique recommends it, in my opinion, both for equivalence checking and state-machine minimization. Simulation-checking algorithms [HHK95] can also be used as a basis for checking language containment among finite automata; these are interesting because they do not, in general, require determinization of both automata being compared.

References

[Ard61] Dean N. Arden. Delayed-logic and finite-state machines. In , pages 133–151. IEEE, 1961.
[BHR84] Stephen D. Brookes, C. A. R. Hoare, and A. W. Roscoe. A theory of communicating sequential processes. Journal of the ACM, 31(3):560–599, 1984.
[BK84] J. A. Bergstra and J. W. Klop. Process algebra for synchronous communication. Information and Control, 60(1):109–137, 1984.
[BK85] Jan A. Bergstra and Jan Willem Klop. Algebra of communicating processes with abstraction. Theoretical Computer Science, 37:77–121, 1985.
[Brz64] Janusz A. Brzozowski. Derivatives of regular expressions. Journal of the ACM, 11(4):481–494, 1964.
[BS86] Gerard Berry and Ravi Sethi. From regular expressions to deterministic automata. Theoretical Computer Science, 48:117–126, 1986.
[HHK95] Monika Rauch Henzinger, Thomas A. Henzinger, and Peter W. Kopke. Computing simulations on finite and infinite graphs. In Proceedings of the 36th Annual IEEE Symposium on Foundations of Computer Science, pages 453–462. IEEE, 1995.
[HMU06] John E. Hopcroft, Rajeev Motwani, and Jeffrey D. Ullman. Introduction to Automata Theory, Languages, and Computation (3rd Edition). Addison-Wesley Longman Publishing Co., Inc., Boston, 2006.
[Kle56] S. C. Kleene. Representation of events in nerve nets and finite automata. In Automata Studies, pages 3–41. Princeton University Press, 1956.
[KS90] Paris C. Kanellakis and Scott A. Smolka. CCS expressions, finite state processes, and three problems of equivalence. Information and Computation, 86(1):43–68, 1990.
[Mil80] Robin Milner. A Calculus of Communicating Systems, volume 92 of Lecture Notes in Computer Science. Springer, 1980.
[Plo81] Gordon D. Plotkin. A structural approach to operational semantics. Technical report, Aarhus University, Denmark, 1981.
[Plo04] Gordon D. Plotkin. The origins of structural operational semantics. The Journal of Logic and Algebraic Programming, 60:3–15, 2004.
[Tho68] Ken Thompson. Programming techniques: Regular expression search algorithm. Communications of the ACM, 11(6):419–422, June 1968.
Appendix A. SOS Rules for √ and −→

Here are the inference rules used to define √. They are given in the form

    premises
    ──────────
    conclusion

with − denoting an empty list of premises.

    −          −           r1 √            r2 √            r1 √   r2 √
    ────       ─────       ───────────     ───────────     ───────────
    ε √        r∗ √        (r1 + r2) √     (r1 + r2) √     (r1 · r2) √
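Read operationally, these rules translate directly into a recursive predicate. The sketch below uses a tuple encoding of expressions of my own devising (('eps',), ('sym', a), ('alt', r1, r2), ('cat', r1, r2), ('star', r)); it is illustrative rather than the paper's code.

```python
# Recursive rendering of the rules for the acceptance predicate:
# nullable(r) holds exactly when the empty word belongs to L(r).

def nullable(r):
    tag = r[0]
    if tag == 'eps':
        return True                               # axiom for epsilon
    if tag == 'sym':
        return False                              # no rule for a bare symbol
    if tag == 'star':
        return True                               # axiom for r*
    if tag == 'alt':
        return nullable(r[1]) or nullable(r[2])   # the two rules for +
    if tag == 'cat':
        return nullable(r[1]) and nullable(r[2])  # the rule for ·
    raise ValueError('unknown operator: %r' % (tag,))
```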
Next are the rules for −→.

    −             r1 −a→ r1′            r2 −a→ r2′
    ────────      ────────────────      ────────────────
    a −a→ ε       r1 + r2 −a→ r1′       r1 + r2 −a→ r2′

    r1 −a→ r1′                 r1 √   r2 −a→ r2′        r −a→ r′
    ────────────────────       ──────────────────       ──────────────
    r1 · r2 −a→ r1′ · r2       r1 · r2 −a→ r2′          r∗ −a→ r′ · r∗
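Taken together with Algorithm 1, these rules yield a compact implementation. The Python sketch below again uses an illustrative tuple encoding; the names out, nullable and nfa are mine, not the paper's, and the acceptance predicate is repeated so that the sketch is self-contained.

```python
# out(r) computes the transitions licensed by the rules for -a->, one
# clause per rule; nfa(r) then performs the worklist construction of
# Algorithm 1.  Expressions are nested tuples: ('eps',), ('sym', a),
# ('alt', r1, r2), ('cat', r1, r2), ('star', r).

def nullable(r):
    # acceptance predicate (the rules for the check mark)
    tag = r[0]
    if tag in ('eps', 'star'):
        return True
    if tag == 'sym':
        return False
    if tag == 'alt':
        return nullable(r[1]) or nullable(r[2])
    if tag == 'cat':
        return nullable(r[1]) and nullable(r[2])
    raise ValueError(tag)

def out(r):
    # outgoing transitions {(a, r')} with r -a-> r'
    tag = r[0]
    if tag == 'eps':
        return set()
    if tag == 'sym':
        return {(r[1], ('eps',))}             # axiom: a -a-> eps
    if tag == 'alt':
        return out(r[1]) | out(r[2])          # the two rules for +
    if tag == 'cat':
        r1, r2 = r[1], r[2]
        moves = {(a, ('cat', s, r2)) for (a, s) in out(r1)}
        if nullable(r1):                      # rule with premise r1-check
            moves |= out(r2)
        return moves
    if tag == 'star':
        # r -a-> r'  implies  r* -a-> r' · r*
        return {(a, ('cat', s, r)) for (a, s) in out(r[1])}
    raise ValueError(tag)

def nfa(r):
    # worklist construction of M_r (Algorithm 1)
    Q, F, delta, W = {r}, set(), set(), [r]
    while W:
        q = W.pop()
        if nullable(q):
            F.add(q)                          # q is an accepting state
        for (a, q2) in out(q):
            delta.add((q, a, q2))
            if q2 not in Q:                   # q2 is a new expression
                Q.add(q2)
                W.append(q2)
    return Q, delta, r, F
```

On (abb + a)∗ this produces the four-state NFA of Figure 2, with the start state among the two accepting states.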