Automatic Equivalence Proofs for Non-deterministic Coalgebras
Marcello Bonsangue, Georgiana Caltais, Eugen-Ioan Goriac, Dorel Lucanu, Jan Rutten, Alexandra Silva
Marcello Bonsangue (a,d), Georgiana Caltais (b), Eugen-Ioan Goriac (b), Dorel Lucanu (c), Jan Rutten (d,e), Alexandra Silva (e,d,f)
a LIACS - Leiden University, The Netherlands
b School of Computer Science - Reykjavik University, Iceland
c Faculty of Computer Science - Alexandru Ioan Cuza University, Romania
d Centrum Wiskunde & Informatica, The Netherlands
e Radboud University Nijmegen, The Netherlands
f HASLab / INESC TEC, Universidade do Minho, Braga, Portugal
Abstract
A notion of generalized regular expressions for a large class of systems modeled as coalgebras, and an analogue of Kleene's theorem and Kleene algebra, were recently proposed by a subset of the authors of this paper. Examples of the systems covered include infinite streams, deterministic automata, Mealy machines and labelled transition systems. In this paper, we present a novel algorithm to decide whether two expressions are bisimilar or not. The procedure is implemented in the automatic theorem prover CIRC, by reducing coinduction to an entailment relation between an algebraic specification and an appropriate set of equations. We illustrate the generality of the tool with three examples: infinite streams of real numbers, Mealy machines and labelled transition systems.
1. Introduction
Regular expressions and finite deterministic automata (DFA's) constitute two of the most basic structures in computer science. Kleene's theorem [10] gives a fundamental correspondence between these two structures: each regular expression denotes a language that can be recognized by a DFA and, conversely, the language accepted by a DFA can be specified by a regular expression. Languages denoted by regular expressions are called regular. Two regular expressions are (language) equivalent if they denote the same regular language. Salomaa [21] presented a sound and complete axiomatization (later refined by Kozen in [11, 12]) for proving the equivalence of regular expressions.

The above programme was applied by Milner in [15] to process behaviours and labelled transition systems (LTS's). Milner introduced a set of expressions for finite LTS's and proved an analogue of Kleene's theorem: each expression denotes the behaviour of a finite LTS and, conversely, the behaviour of a finite LTS can be specified by an expression (modulo bisimilarity). Milner also provided an axiomatization for his expressions, with the property that two expressions are provably equivalent if and only if they are bisimilar.

Coalgebras arose in the last decade as a suitable mathematical framework to study state-based systems, such as DFA's and LTS's. For a functor $G : Set \to Set$, a $G$-coalgebra or $G$-system is a pair $(S, g)$, consisting of a set $S$ of states and a function $g : S \to G(S)$ defining the "transitions" of the states. We call the functor $G$ the type of the system. For instance, DFA's can be readily seen to correspond to coalgebras of the functor $G(S) = 2 \times S^A$, and image-finite LTS's are obtained by $G(S) = \mathcal{P}_\omega(S)^A$, where $\mathcal{P}_\omega$ is finite powerset.

For coalgebras of a large class of functors, a language of regular expressions, a corresponding generalization of Kleene's theorem, and a sound and complete axiomatization for the associated notion of behavioral equivalence were introduced in [23]. Both the language of expressions and their axiomatization were derived, in a modular fashion, from the functor defining the type of the system.

Algebra and related tools can be successfully used for reasoning on properties of systems. In this paper, we present a novel method for checking bisimilarity of generalized regular expressions using the coinductive theorem prover CIRC [5, 17]. The main novelty of the method lies in the generality of the systems it can handle.

Preprint submitted to Science of Computer Programming, June 18, 2018.
CIRC is a metalanguage application implemented in Maude [4], and its target is to prove properties over infinite data structures. It has been successfully used for checking the equivalence of programs, and trace equivalence and strong bisimilarity of processes. The tool may be tested online and downloaded from https://fmse.info.uaic.ro/tools/Circ/ .

Determining whether two expressions are equivalent is important in order to be able to compare behavioral specifications. In the presence of a sound and complete axiomatization one can determine equivalence using algebraic reasoning. A coalgebraic perspective on regular expressions has however provided a more operational/algorithmic way of checking equivalence: one constructs a bisimulation relation containing both expressions. The advantage of the bisimulation approach is that it enables automation, since the steps of the construction are fairly mechanical and require almost no ingenuity.

We remark that in theory it has been shown that both problems are in PSPACE [13, 25], but in practice bisimulation checking tends to be easier. We illustrate this with an example, to give the reader a feeling for the more algorithmic nature of bisimulation. We want to stress however that we are not underestimating the value of an algebraic treatment of regular expressions: on the contrary, as we will show later, the axiomatization plays an important role in guaranteeing termination of the bisimulation construction and is therefore crucial for the main result of this article.

We show below a proof of the sliding rule: $a(ba)^* \equiv (ab)^*a$. The algebraic proof, using the rules and equations of Kleene algebra, needs to show the two containments $a(ba)^* \le (ab)^*a$ and $(ab)^*a \le a(ba)^*$, and it requires some ingenuity in the choice of the equation applied in each step. We show the proof for the first inequality; the other follows a similar proof pattern.
$$
\begin{array}{lll}
 & a(ba)^* \le (ab)^*a & \\
\Leftarrow & a + (ab)^*a(ba) \le (ab)^*a & \text{right-star rule} \\
\iff & (1 + (ab)^*ab)a \le (ab)^*a & \text{associativity and distributivity} \\
\iff & (ab)^*a \le (ab)^*a & \text{right expansion rule: } 1 + r^*r = r^*
\end{array}
$$

For the coalgebraic proof, we build incrementally, and rather mechanically, a bisimulation relation containing the pair $(a(ba)^*, (ab)^*a)$. We start with the pair we want to prove equivalent and then we close the relation with respect to syntactic language derivatives, also known as Brzozowski derivatives. In the current example, the bisimulation relation would contain three pairs:

$$R = \{ (a(ba)^*, (ab)^*a),\ ((ba)^*, b(ab)^*a + 1),\ (0, 0) \}$$

where 1 and 0 are, respectively, the regular expressions denoting the empty word and the empty language. In constructing this relation, no decisions were made, hence the suitability of bisimulation construction as an automatic technique to prove equivalence of regular expressions.

The main contributions of this paper can be summarized as follows. We present a decision procedure to determine equivalence of generalized regular expressions, which specify behaviours of many types of transition systems, including Mealy machines, labelled transition systems and infinite streams. The valid expressions for each system are type-checked automatically in the tool. We illustrate the decision procedure we devised by applying it to several examples. As a vehicle of implementation, we choose CIRC, a coinductive theorem prover which has already been explored for the construction of bisimulations. To ease the implementation in CIRC, we present the algebraic specifications' counterpart of the coalgebraic framework of the generalized regular expressions mentioned above. This enables us to automatically derive algebraic specifications that model the language of expressions, and to define an appropriate equational entailment relation which mimics our decision procedure for checking behavioural equivalence of expressions. The implementation of both the algebraic specification and the entailment relation in CIRC allows for automatic reasoning on the equivalence of expressions.

The present paper is an extended version of the conference paper [2]. In comparison with the aforementioned paper we have extended the tool to deal with non-deterministic systems. More precisely, we have included the powerset functor in the class of functors considered. Moreover, we have included all the proofs, more examples and additional explanations on the theory behind the tool and on its implementation.

Organization of the paper. Section 2 recalls the basic definitions of the language associated to a non-deterministic functor. Section 3 describes the decision procedure to check equivalence of regular expressions. Section 4 formulates the aforementioned language as an algebraic specification, which paves the way to implement in CIRC the procedure to decide equivalence of expressions. The implementation of the decision procedure and its soundness are described in Section 5. In Section 6 we show, by means of several examples, how one can check bisimilarity using CIRC. Section 7 contains concluding remarks and pointers for future work.
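The coalgebraic proof of the sliding rule described in this section can be replayed mechanically. The following Python sketch is not the CIRC implementation; it is a minimal illustration under our own assumptions: regular expressions are encoded as tuples, the helper names (`alt`, `cat`, `deriv`, `equivalent`) are hypothetical, and sums are normalized modulo ACI and the unit laws so that the closure under Brzozowski derivatives stays finite.

```python
ZERO, ONE = ("0",), ("1",)   # empty language and empty word

def sym(a):
    return ("sym", a)

def alt(r, s):
    """Sum modulo associativity, commutativity, idempotence, with 0 as unit."""
    parts = set()
    for t in (r, s):
        if t[0] == "alt":
            parts |= t[1]
        elif t != ZERO:
            parts.add(t)
    if not parts:
        return ZERO
    return next(iter(parts)) if len(parts) == 1 else ("alt", frozenset(parts))

def cat(r, s):
    """Concatenation, simplified with 0 as annihilator and 1 as unit."""
    if ZERO in (r, s):
        return ZERO
    if r == ONE:
        return s
    return r if s == ONE else ("cat", r, s)

def star(r):
    return ("star", r)

def nullable(r):
    """True iff the language of r contains the empty word."""
    tag = r[0]
    if tag in ("1", "star"):
        return True
    if tag in ("0", "sym"):
        return False
    if tag == "alt":
        return any(nullable(t) for t in r[1])
    return nullable(r[1]) and nullable(r[2])   # cat

def deriv(r, a):
    """Brzozowski derivative of r with respect to the letter a."""
    tag = r[0]
    if tag in ("0", "1"):
        return ZERO
    if tag == "sym":
        return ONE if r[1] == a else ZERO
    if tag == "alt":
        out = ZERO
        for t in r[1]:
            out = alt(out, deriv(t, a))
        return out
    if tag == "star":
        return cat(deriv(r[1], a), r)
    head = cat(deriv(r[1], a), r[2])           # cat
    return alt(head, deriv(r[2], a)) if nullable(r[1]) else head

def equivalent(r, s, alphabet):
    """Return a bisimulation containing (r, s), or None if none exists."""
    relation, todo = set(), [(r, s)]
    while todo:
        pair = todo.pop()
        if pair in relation:
            continue
        p, q = pair
        if nullable(p) != nullable(q):
            return None
        relation.add(pair)
        todo.extend((deriv(p, a), deriv(q, a)) for a in alphabet)
    return relation

a, b = sym("a"), sym("b")
left = cat(a, star(cat(b, a)))     # a(ba)*
right = cat(star(cat(a, b)), a)    # (ab)*a
sliding = equivalent(left, right, "ab")
```

Under this encoding, `sliding` contains the three pairs of the relation $R$ given in the text, and no choice is made at any point of the construction.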
2. Regular Expressions for Non-deterministic Coalgebras
In this section, we briefly recall the basic definitions in [23].

Let $Set$ denote the category of sets (represented by capital letters $X, Y, \ldots$) and functions (represented by lower case letters $f, g, \ldots$). We write $Y^X$ for the family of functions from $X$ to $Y$ and $\mathcal{P}_\omega(X)$ for the collection of finite subsets of a set $X$. The product of two sets $X, Y$ is written as $X \times Y$ and has the projection functions $\pi_1$ and $\pi_2$: $X \xleftarrow{\pi_1} X \times Y \xrightarrow{\pi_2} Y$. We define $X ✸+ Y = X \uplus Y \uplus \{\bot, \top\}$, where $\uplus$ is the disjoint union of sets, with injections $X \xrightarrow{\kappa_1} X \uplus Y \xleftarrow{\kappa_2} Y$. Note that the set $X ✸+ Y$ is different from the classical coproduct of $X$ and $Y$ (which we shall denote by $X + Y$), because of the two extra elements $\bot$ and $\top$. These extra elements are used to represent, respectively, underspecification and inconsistency in the specification of some systems.

For each of the operations defined above on sets, there are analogous ones on functions. Let $f : X \to Y$, $f_1 : X \to Y$ and $f_2 : Z \to W$. We define the following operations:

$f_1 \times f_2 : X \times Z \to Y \times W$, with $(f_1 \times f_2)(x, z) = \langle f_1(x), f_2(z) \rangle$
$f_1 ✸+ f_2 : X ✸+ Z \to Y ✸+ W$, with $(f_1 ✸+ f_2)(c) = c$ for $c \in \{\bot, \top\}$, and $(f_1 ✸+ f_2)(\kappa_i(x)) = \kappa_i(f_i(x))$ for $i \in \overline{1,2}$
$f^A : X^A \to Y^A$, with $f^A(g) = f \circ g$
$\mathcal{P}_\omega(f) : \mathcal{P}_\omega(X) \to \mathcal{P}_\omega(Y)$, with $\mathcal{P}_\omega(f)(S) = \{ y \in Y \mid f(x) = y,\ x \in S \}$ for $S \in \mathcal{P}_\omega(X)$

Remark 1. For the sake of brevity, we use the notation $i \in \overline{1,n}$ as a shorthand for $i \in \{1, \ldots, n\}$.

Note that in the definition above we are using the same symbols that we defined above for the operations on sets. It will always be clear from the context which operation is being used.

In our definition of non-deterministic functors we will use constant sets equipped with an information order. In particular, we will use join-semilattices. A (bounded) join-semilattice is a set $B$ equipped with a binary operation $\vee_B$ and a constant $\bot_B \in B$, such that $\vee_B$ is commutative, associative and idempotent. The element $\bot_B$ is neutral with respect to $\vee_B$. As usual, $\vee_B$ gives rise to a partial ordering $\le_B$ on the elements of $B$: $b_1 \le_B b_2 \Leftrightarrow b_1 \vee_B b_2 = b_2$. Every set $S$ can be mapped into a join-semilattice by taking $B$ to be the set of all finite subsets of $S$, with the empty set as $\bot_B$ and union as join.

Coalgebras. A coalgebra is a pair $(S, g : S \to G(S))$, where $S$ is a set of states and $G : Set \to Set$ is a functor. The functor $G$, together with the function $g$, determines the transition structure (or dynamics) of the $G$-coalgebra [20]. A coalgebra $(S, g)$ is finite if $S$ is a finite set.

Definition 1 (Bisimulation). Let $(S, f)$ and $(T, g)$ be two $G$-coalgebras. We call a relation $R \subseteq S \times T$ a bisimulation [9] iff

$$(s, t) \in R \Rightarrow (f(s), g(t)) \in \overline{G}(R)$$

where the relation lifting $\overline{G}(R)$ is defined as $\overline{G}(R) = \{ (G(\pi_1)(x), G(\pi_2)(x)) \mid x \in G(R) \}$.

We write $s \sim_G t$ whenever there exists a bisimulation relation containing $(s, t)$, and we call $\sim_G$ the bisimilarity relation. It is of interest to remark that the relation $\sim_G$ is an equivalence relation. We shall drop the subscript $G$ whenever the functor $G$ is clear from the context. In the literature, one finds different definitions of bisimulation or behavioral equivalence [24]. For the class of functors we consider here the different notions coincide and therefore we will not discuss them.

Non-deterministic functors. They are functors $G : Set \to Set$ built inductively from the identity and constants, using $\times$, $✸+$, $(-)^A$ and $\mathcal{P}_\omega$:

$$NDF \ni G ::= Id \mid B \mid G ✸+ G \mid G \times G \mid G^A \mid \mathcal{P}_\omega G \qquad (1)$$

where $B$ is a finite join-semilattice and $A$ is a finite set. Typical examples of non-deterministic functors include $S = B \times Id$, $M = (B \times Id)^A$, $D = 2 \times Id^A$, $Q = (1 ✸+ Id)^A$, $N = 2 \times \mathcal{P}_\omega(Id)^A$ and $L = 1 ✸+ \mathcal{P}_\omega(Id)^A$. These functors represent, respectively, the type of streams, Mealy automata, deterministic automata, partial deterministic automata, non-deterministic automata and labeled transition systems with explicit termination. $S$-bisimulation is stream equality, whereas $D$-bisimulation coincides with language equivalence.

Remark 2. As stated in [23], the use of join-semilattices for constant functors and of the sum $✸+$ instead of the ordinary coproduct enabled the use of underspecification and inconsistency (i.e., $\bot$ and $\top$, respectively) in the specification of systems and, moreover, has allowed the whole framework to be studied in the category $Set$. Even though underspecification and inconsistency can be captured by a semilattice structure, and the axiomatization provides the set of expressions with a join-semilattice structure (therefore allowing one to work directly in the category of join-semilattices), remaining in the category $Set$ was chosen for simplicity.
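The join-semilattice laws used for the constant functors, and the construction that turns any set into a join-semilattice of its finite subsets, can be checked by brute force on small carriers. The Python sketch below is only an illustration; the function names (`finite_subsets`, `is_join_semilattice`, `leq`) are our own and do not come from [23] or from CIRC.

```python
from itertools import combinations

def finite_subsets(s):
    """All subsets of the finite set s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for n in range(len(items) + 1)
            for c in combinations(items, n)]

def is_join_semilattice(elems, join, bottom):
    """Check commutativity, idempotence, neutrality of bottom, associativity."""
    return all(
        join(x, y) == join(y, x)
        and join(x, x) == x
        and join(x, bottom) == x
        and all(join(join(x, y), z) == join(x, join(y, z)) for z in elems)
        for x in elems for y in elems
    )

def leq(join, x, y):
    """Induced order: x <= y iff x v y = y."""
    return join(x, y) == y

# The two-element semilattice {0, 1} with bottom 0 and join = max.
two_ok = is_join_semilattice([0, 1], max, 0)

# Finite subsets of {p, q} with the empty set as bottom and union as join.
subsets = finite_subsets({"p", "q"})
subsets_ok = is_join_semilattice(subsets, frozenset.union, frozenset())
```

Both checks succeed, while replacing `max` by `min` (keeping 0 as the bottom element) fails the neutrality law.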
Next, we give the definition of the ingredient relation, which relates a non-deterministic functor $G$ with its ingredients, i.e., the functors used in its inductive construction. We shall use this relation later for typing our expressions.

Definition 2. Let $\rhd \subseteq NDF \times NDF$ be the least reflexive and transitive relation on non-deterministic functors such that

$$G_1 \rhd G_1 \times G_2, \quad G_2 \rhd G_1 \times G_2, \quad G_1 \rhd G_1 ✸+ G_2, \quad G_2 \rhd G_1 ✸+ G_2, \quad G \rhd G^A, \quad G \rhd \mathcal{P}_\omega G.$$

Here and throughout this document we use $F \rhd G$ as a shorthand for $(F, G) \in \rhd$. If $F \rhd G$, then $F$ is said to be an ingredient of $G$. For example, $2$, $Id$, $Id^A$ and $D$ itself are all the ingredients of the deterministic automata functor $D$.

A language of regular expressions for non-deterministic coalgebras. We now associate a language of expressions $Exp_G$ with each non-deterministic functor $G$.

Definition 3 (Expressions). Let $A$ be a finite set, $B$ a finite join-semilattice and $X$ a set of fixed-point variables. The set $Exp$ of all expressions is given by the following grammar, where $a \in A$, $b \in B$ and $x \in X$:

$$\varepsilon ::= x \mid \varepsilon \oplus \varepsilon \mid \gamma \qquad (2)$$

where $\gamma$ is a guarded expression given by:

$$\gamma ::= \emptyset \mid \gamma \oplus \gamma \mid \mu x.\gamma \mid b \mid l\langle\varepsilon\rangle \mid r\langle\varepsilon\rangle \mid l[\varepsilon] \mid r[\varepsilon] \mid a(\varepsilon) \mid \{\varepsilon\} \qquad (3)$$

In the expression $\mu x.\gamma$, $\mu$ is a binder for all the free occurrences of $x$ in $\gamma$. Variables that are not bound are free. A closed expression is an expression without free occurrences of fixed-point variables $x$. We denote the set of closed expressions by $Exp^c$.

The language of expressions for non-deterministic coalgebras is a generalization of the classical notion of regular expressions: $\emptyset$, $\varepsilon \oplus \varepsilon$ and $\mu x.\gamma$ play similar roles to the regular expressions denoting the empty language, the union of languages and the Kleene star. Moreover, note that, not unexpectedly, in [23], $\oplus$ was axiomatized as an associative, commutative and idempotent operator, with $\emptyset$ as a neutral element. The expressions $l\langle\varepsilon\rangle$, $r\langle\varepsilon\rangle$, $l[\varepsilon]$, $r[\varepsilon]$, $a(\varepsilon)$ and $\{\varepsilon\}$ specify the left and right hand-side of products and sums, function application and singleton sets, respectively. Next, we present a type assignment system for associating expressions to non-deterministic functors. This will allow us to associate with each functor $G$ the expressions $\varepsilon \in Exp^c$ that are valid specifications of $G$-coalgebras.

Definition 4 (Type system). We now define a typing relation $\vdash\ \subseteq Exp \times NDF \times NDF$ that will associate an expression $\varepsilon$ with two non-deterministic functors $F$ and $G$, which are related by the ingredient relation ($F$ is an ingredient of $G$). We shall write $\vdash \varepsilon : F \rhd G$ for $(\varepsilon, F, G) \in\ \vdash$. The rules that define $\vdash$ are the following:

$\vdash \emptyset : F \rhd G$
$\vdash b : B \rhd G$ \quad $(b \in B)$
$\vdash x : G \rhd G$ \quad $(x \in X)$
$\dfrac{\vdash \varepsilon : G \rhd G}{\vdash \mu x.\varepsilon : G \rhd G}$
$\dfrac{\vdash \varepsilon_1 : F \rhd G \quad \vdash \varepsilon_2 : F \rhd G}{\vdash \varepsilon_1 \oplus \varepsilon_2 : F \rhd G}$
$\dfrac{\vdash \varepsilon : G \rhd G}{\vdash \varepsilon : Id \rhd G}$
$\dfrac{\vdash \varepsilon : F_2 \rhd G}{\vdash r[\varepsilon] : F_1 ✸+ F_2 \rhd G}$
$\dfrac{\vdash \varepsilon : F \rhd G}{\vdash a(\varepsilon) : F^A \rhd G}$ \quad $(a \in A)$
$\dfrac{\vdash \varepsilon : F_1 \rhd G}{\vdash l\langle\varepsilon\rangle : F_1 \times F_2 \rhd G}$
$\dfrac{\vdash \varepsilon : F_2 \rhd G}{\vdash r\langle\varepsilon\rangle : F_1 \times F_2 \rhd G}$
$\dfrac{\vdash \varepsilon : F_1 \rhd G}{\vdash l[\varepsilon] : F_1 ✸+ F_2 \rhd G}$
$\dfrac{\vdash \varepsilon : F \rhd G}{\vdash \{\varepsilon\} : \mathcal{P}_\omega F \rhd G}$

We can now formally define the set of $G$-expressions: well-typed expressions associated with a non-deterministic functor $G$.

Definition 5 ($G$-expressions). Let $G$ be a non-deterministic functor and $F$ an ingredient of $G$. We define $Exp_{F \rhd G}$ by:

$$Exp_{F \rhd G} = \{ \varepsilon \in Exp^c \mid\ \vdash \varepsilon : F \rhd G \}.$$

We define the set $Exp_G$ of well-typed $G$-expressions by $Exp_{G \rhd G}$.

In [23], it was proved that the set of $G$-expressions for a given non-deterministic functor $G$ has a coalgebraic structure:

$$\delta_G : Exp_G \to G(Exp_G)$$

More precisely, in [23], which we refer to for the complete definition of $\delta_G$, the authors defined a function $\delta_{F \rhd G} : Exp_{F \rhd G} \to F(Exp_G)$ and then set $\delta_G = \delta_{G \rhd G}$.

The coalgebraic structure on the set of expressions enabled the proof of a Kleene-like theorem.

Theorem 1 (Kleene's theorem for non-deterministic coalgebras). Let $G$ be a non-deterministic functor.
1. For any $\varepsilon \in Exp_G$, there exists a finite $G$-coalgebra $(S, g)$ and $s \in S$ such that $\varepsilon \sim s$.
2. For every finite $G$-coalgebra $(S, g)$ and $s \in S$ there exists an expression $\varepsilon_s \in Exp_G$ such that $\varepsilon_s \sim s$.

In order to provide the reader with intuition about the notions presented above, we illustrate them with an example.
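Definitions 2 and 4 are syntax-directed, so they translate almost literally into a recursive check. The Python sketch below is an illustration only, not the CIRC specification: the tuple encodings of functors and expressions, the tags (`"l<>"`, `"r[]"`, ...) and the function names are assumptions of ours, and the check covers typing only (closedness and guardedness are not verified).

```python
ID = ("Id",)

def const(elems):          # constant functor B, given by its elements
    return ("B", frozenset(elems))

def prod(f1, f2):          # F1 x F2
    return ("prod", f1, f2)

def ssum(f1, f2):          # F1 <>+ F2
    return ("sum", f1, f2)

def exp(f, alphabet):      # F^A
    return ("exp", f, frozenset(alphabet))

def powerset(f):           # P_omega F
    return ("pow", f)

def ingredients(g):
    """All functors F with F |> G (Definition 2)."""
    result = {g}
    if g[0] in ("prod", "sum"):
        result |= ingredients(g[1]) | ingredients(g[2])
    elif g[0] in ("exp", "pow"):
        result |= ingredients(g[1])
    return result

def well_typed(e, f, g):
    """Decide |- e : F |> G following the rules of Definition 4."""
    if f not in ingredients(g):
        return False
    tag = e[0]
    if tag == "empty":
        return True
    if f == ID and g != ID:            # e : G |> G implies e : Id |> G
        return well_typed(e, g, g)
    if tag == "const":
        return f[0] == "B" and e[1] in f[1]
    if tag == "var":
        return f == g
    if tag == "mu":                    # ("mu", x, body)
        return f == g and well_typed(e[2], g, g)
    if tag == "plus":
        return well_typed(e[1], f, g) and well_typed(e[2], f, g)
    if tag in ("l<>", "r<>"):
        return f[0] == "prod" and well_typed(e[1], f[1 if tag == "l<>" else 2], g)
    if tag in ("l[]", "r[]"):
        return f[0] == "sum" and well_typed(e[1], f[1 if tag == "l[]" else 2], g)
    if tag == "app":                   # ("app", a, body)
        return f[0] == "exp" and e[1] in f[2] and well_typed(e[2], f[1], g)
    if tag == "set":
        return f[0] == "pow" and well_typed(e[1], f[1], g)
    return False

# Stream functor S = B x Id and deterministic automata D = 2 x Id^A.
S = prod(const({0, 1}), ID)
D = prod(const({0, 1}), exp(ID, "ab"))
x, one = ("var", "x"), ("const", 1)
ok = well_typed(("mu", "x", ("plus", ("r<>", x), ("l<>", one))), S, S)
bad = well_typed(("l[]", one), S, S)   # S does not involve the sum
```

With this encoding `ingredients(D)` yields exactly the four functors named in the text ($2$, $Id$, $Id^A$ and $D$), `ok` is true and `bad` is false.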
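Example 1. Let us instantiate the definition of $G$-expressions to the functor of streams $S = B \times Id$ (the ingredients of this functor are $B$, $Id$ and $S$ itself). Let $X$ be a set of (recursion or) fixed-point variables. The set $Exp_S$ of stream expressions is given by the set of closed, guarded expressions generated by the following BNF grammar. For $x \in X$:

$$Exp_S \ni \varepsilon ::= \emptyset \mid \varepsilon \oplus \varepsilon \mid \mu x.\varepsilon \mid x \mid l\langle\tau\rangle \mid r\langle\varepsilon\rangle \qquad \tau ::= \emptyset \mid b \mid \tau \oplus \tau \qquad (4)$$

Intuitively, the expression $l\langle b \rangle$ is used to specify that the head of the stream is $b$, while $r\langle\varepsilon\rangle$ specifies a stream whose tail behaves as specified by $\varepsilon$. For the two element join-semilattice $B = \{0, 1\}$ (with $\bot_B = 0$) examples of well-typed expressions include $\emptyset$, $l\langle 1 \rangle \oplus r\langle l\langle\emptyset\rangle\rangle$ and $\mu x.r\langle x \rangle \oplus l\langle 1 \rangle$. The expressions $l[1]$, $l\langle 1 \rangle \oplus 1$ and $\mu x.1$ are examples of non well-typed expressions for $S$, because the functor $S$ does not involve $✸+$, the subexpressions in the sum have different type, and recursion is not at the outermost level ($1$ has type $B \rhd S$), respectively.

By applying the definition in [23], the coalgebra structure on expressions $\delta_S$ would be given by:

$\delta_S : Exp_S \to B \times Exp_S$
$\delta_S(\emptyset) = \langle \bot_B, \emptyset \rangle$
$\delta_S(\varepsilon_1 \oplus \varepsilon_2) = \langle b_1 \vee b_2, \varepsilon_1' \oplus \varepsilon_2' \rangle$ where $\langle b_i, \varepsilon_i' \rangle = \delta_S(\varepsilon_i)$, $i \in \overline{1,2}$
$\delta_S(\mu x.\varepsilon) = \delta_S(\varepsilon[\mu x.\varepsilon/x])$
$\delta_S(l\langle\tau\rangle) = \langle \delta_{B \rhd S}(\tau), \emptyset \rangle$
$\delta_S(r\langle\varepsilon\rangle) = \langle \bot_B, \varepsilon \rangle$
$\delta_{B \rhd S}(\emptyset) = \bot_B$
$\delta_{B \rhd S}(b) = b$
$\delta_{B \rhd S}(\tau \oplus \tau') = \delta_{B \rhd S}(\tau) \vee \delta_{B \rhd S}(\tau')$

The proof of Kleene's theorem provides algorithms to go from expressions to streams and vice-versa. We illustrate it by means of examples.

Consider the following stream:

[Figure: automaton with states $s_1$, $s_2$, $s_3$ representing the infinite stream $(1, 0, 1, 0, 1, 0, 1, \ldots)$.]

To compute expressions $\varepsilon_1$, $\varepsilon_2$ and $\varepsilon_3$ equivalent to $s_1$, $s_2$ and $s_3$ we associate with each state $s_i$ a variable $x_i$ and get the equations:

$$\varepsilon_1 = \mu x_1.l\langle 1 \rangle \oplus r\langle x_2 \rangle \qquad \varepsilon_2 = \mu x_2.l\langle 0 \rangle \oplus r\langle x_3 \rangle \qquad \varepsilon_3 = \mu x_3.l\langle 1 \rangle \oplus r\langle x_2 \rangle$$

As our goal is to remove all the occurrences of free variables in our expressions, we proceed as follows. First we substitute $x_2$ by $\varepsilon_2$ in $\varepsilon_1$, and $x_3$ by $\varepsilon_3$ in $\varepsilon_2$, and obtain the following expressions:

$$\varepsilon_1 = \mu x_1.l\langle 1 \rangle \oplus r\langle \varepsilon_2 \rangle \qquad \varepsilon_2 = \mu x_2.l\langle 0 \rangle \oplus r\langle \varepsilon_3 \rangle$$

Note that at this point $\varepsilon_1$ and $\varepsilon_2$ already denote closed expressions. Therefore, as a last step, we replace $x_2$ in $\varepsilon_3$ by $\varepsilon_2$ and get the following closed expressions:

$$\varepsilon_1 = \mu x_1.l\langle 1 \rangle \oplus r\langle \varepsilon_2 \rangle \qquad \varepsilon_2 = \mu x_2.l\langle 0 \rangle \oplus r\langle \varepsilon_3 \rangle \qquad \varepsilon_3 = \mu x_3.l\langle 1 \rangle \oplus r\langle \mu x_2.l\langle 0 \rangle \oplus r\langle x_3 \rangle \rangle$$

satisfying $\varepsilon_1 \sim s_1$, $\varepsilon_2 \sim s_2$ and $\varepsilon_3 \sim s_3$.

For the converse construction, consider the expression $\varepsilon = (\mu x.r\langle x \rangle) \oplus l\langle 1 \rangle$. We construct an automaton by repeatedly applying the coalgebra structure on expressions $\delta_S$, modulo associativity, commutativity and idempotence (ACI) of $\oplus$ in order to guarantee finiteness.

First, note that $\delta_S(\mu x.r\langle x \rangle) = \delta_S(r\langle \mu x.r\langle x \rangle \rangle) = \langle \bot_B, \mu x.r\langle x \rangle \rangle$. Applying the definition of $\delta_S$ above, we have:

$$\delta_S(\varepsilon) = \langle 1, (\mu x.r\langle x \rangle) \oplus \emptyset \rangle \quad \text{and} \quad \delta_S((\mu x.r\langle x \rangle) \oplus \emptyset) = \langle 0, (\mu x.r\langle x \rangle) \oplus \emptyset \rangle$$

which leads to the following stream (automaton):

[Figure: automaton with the two states $\varepsilon$ and $(\mu x.r\langle x \rangle) \oplus \emptyset$.]

Note that the direct application of $\delta_S$, without ACI, might generate infinite automata. Take, for instance, the expression $\varepsilon = \mu x.r\langle x \oplus x \rangle$. Note that $\delta_S(\mu x.r\langle x \oplus x \rangle) = \langle 0, \varepsilon \oplus \varepsilon \rangle$, $\delta_S(\varepsilon \oplus \varepsilon) = \langle 0, (\varepsilon \oplus \varepsilon) \oplus (\varepsilon \oplus \varepsilon) \rangle$, and so on. This would generate the infinite automaton with states $\varepsilon$, $\varepsilon \oplus \varepsilon$, $(\varepsilon \oplus \varepsilon) \oplus (\varepsilon \oplus \varepsilon)$, $\ldots$ instead of the intended, simple and very finite, automaton with the single state $\varepsilon$. The axiom $\varepsilon \oplus \emptyset \equiv \varepsilon$ could also be used in order to obtain smaller automata, but it is not crucial for termination.

Throughout the paper, we will often use streams as a basic example to illustrate the definitions. It should be remarked that the framework is general enough to include more complex examples, such as deterministic automata, automata on guarded strings, Mealy machines and labelled transition systems. The latter two will be used as examples in Section 6.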
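The role of ACI in keeping the generated automaton finite can be observed on a small executable model of $\delta_S$. The Python sketch below makes several assumptions of its own: $B = \{0, 1\}$ with `max` as join, a tuple encoding of stream expressions, and sums stored as frozensets so that ACI holds by construction (the unit law for $\emptyset$ is deliberately not applied). The names `plus`, `subst`, `delta` and `automaton` are hypothetical, and the head value 1 in the sample expression is chosen for illustration.

```python
EMPTY = ("empty",)

def plus(*es):
    """Sum modulo ACI: operands are flattened into a frozenset."""
    parts = set()
    for e in es:
        parts |= e[1] if e[0] == "plus" else {e}
    return next(iter(parts)) if len(parts) == 1 else ("plus", frozenset(parts))

def subst(e, x, by):
    """e[by/x] for a closed expression `by` (no capture can occur)."""
    tag = e[0]
    if tag == "var":
        return by if e[1] == x else e
    if tag == "mu":
        return e if e[1] == x else ("mu", e[1], subst(e[2], x, by))
    if tag == "plus":
        return plus(*(subst(t, x, by) for t in e[1]))
    if tag in ("l", "r"):
        return (tag, subst(e[1], x, by))
    return e                                  # empty, const

def delta_b(t):
    """delta for the ingredient B: the join of the constants in t."""
    if t[0] == "const":
        return t[1]
    if t[0] == "plus":
        return max(delta_b(u) for u in t[1])
    return 0                                  # empty gives bottom

def delta(e):
    """delta_S: returns the pair (head, tail expression)."""
    tag = e[0]
    if tag == "empty":
        return (0, EMPTY)
    if tag == "plus":
        steps = [delta(t) for t in e[1]]
        return (max(b for b, _ in steps), plus(*(n for _, n in steps)))
    if tag == "mu":
        return delta(subst(e[2], e[1], e))
    if tag == "l":
        return (delta_b(e[1]), EMPTY)
    if tag == "r":
        return (0, e[1])
    raise ValueError("not a closed, guarded stream expression")

def automaton(e):
    """States reachable from e, each mapped to its (head, tail) pair."""
    states, todo = {}, [e]
    while todo:
        s = todo.pop()
        if s not in states:
            states[s] = delta(s)
            todo.append(states[s][1])
    return states

x = ("var", "x")
eps = plus(("mu", "x", ("r", x)), ("l", ("const", 1)))   # (mu x.r<x>) (+) l<1>
loop = ("mu", "x", ("r", plus(x, x)))                     # mu x.r<x (+) x>
```

Here `automaton(eps)` has two states, and `automaton(loop)` has a single state because `plus(x, x)` already collapses to `x`; dropping idempotence from `plus` would make the second exploration diverge.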
3. A Decision Procedure for the Equivalence of Generalized Regular Expressions
In this section, we briefly describe the decision procedure to determine whether two expressions are equivalent or not.

The key observation is that point 1. of Theorem 1 above guarantees that each expression in the language for a given system can always be associated to a finite coalgebra. Given two expressions $\varepsilon_1$ and $\varepsilon_2$ in the language $Exp_G$ of a given functor $G$ we can decide whether they are equivalent by constructing a finite bisimulation between them. This is because the finite coalgebra generated from an expression contains precisely all states that one needs to construct the equivalence relation. Even though this might seem like a trivial observation, it has very concrete consequences: for (all well-typed) generalized regular expressions we can always either determine that they are bisimilar, and exhibit a proof in the form of a bisimulation, or conclude that they are not bisimilar and pinpoint the difference by showing why the bisimulation construction failed. Hence, we have a decision procedure for equivalence of generalized regular expressions.

We will give the reader a brief example on how the equivalence check works. Further examples, for different types of systems, including examples of non-equivalence, will appear in Section 6.

We will show that the stream expressions $\varepsilon_1 = \mu x.r\langle x \rangle \oplus l\langle 0 \rangle$ and $\varepsilon_2 = r\langle \mu x.r\langle x \rangle \oplus l\langle 0 \rangle \rangle \oplus l\langle 0 \rangle$ are equivalent. In order to do that, we have to build a bisimulation relation $R$ on expressions for the stream functor $S$, defined above, such that $(\varepsilon_1, \varepsilon_2) \in R$. We do this in the following way: we start by taking $R = \{(\varepsilon_1, \varepsilon_2)\}$ and we check whether this is already a bisimulation, by applying $\delta_S$ to each of the expressions and checking whether the expressions have the same output value and, moreover, that no new pairs of expressions (modulo associativity, commutativity and idempotence, for more details see page 25) appear when taking transitions. If new pairs of expressions appear we add them to $R$ and repeat the process.
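The construction just described is a standard worklist loop. The Python sketch below is a generic illustration for stream-like systems, assuming a transition function `delta` that returns an (output, next state) pair and states that are already normalized; the function name `bisimulation` and the hand-written transition table are ours, with the two stream expressions of the example abbreviated to the state names `"e1"` and `"e2"` and their successors simplified using $\varepsilon \oplus \emptyset \equiv \varepsilon$.

```python
def bisimulation(delta, left, right):
    """Build a bisimulation containing (left, right), or return None.

    `delta(state)` must return a pair (output, next_state). The loop
    terminates whenever finitely many states are reachable.
    """
    relation, todo = set(), [(left, right)]
    while todo:
        pair = todo.pop()
        if pair in relation:
            continue                      # pair already handled
        (out_l, next_l), (out_r, next_r) = delta(pair[0]), delta(pair[1])
        if out_l != out_r:
            return None                   # outputs differ: not bisimilar
        relation.add(pair)
        todo.append((next_l, next_r))     # successors must be related too
    return relation

# Hand-written transitions: both expressions output 0 and step to e1.
table = {"e1": (0, "e1"), "e2": (0, "e1")}
proof = bisimulation(table.__getitem__, "e1", "e2")

# A failing case: a stream of zeros against a stream of ones.
other = {"zeros": (0, "zeros"), "ones": (1, "ones")}
refutation = bisimulation(other.__getitem__, "zeros", "ones")
```

In this sketch `proof` is the two-pair relation containing `("e1", "e2")` and `("e1", "e1")`, while `refutation` is `None` because the first outputs already differ.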
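Intuitively, for this particular example, the transition structure can be depicted as follows:

[Figure 1: Bisimulation construction. Start with $R = \{(\varepsilon_1, \varepsilon_2)\}$; the transitions from $\varepsilon_1$ and $\varepsilon_2$ lead to the pair $(\varepsilon_1, \varepsilon_1) \notin R$, which is added, giving $R = \{(\varepsilon_1, \varepsilon_2), (\varepsilon_1, \varepsilon_1)\}$; the transitions from $(\varepsilon_1, \varepsilon_1)$ lead to a pair that is already in $R$.]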
Here, we omit the output values of the expressions, which are all 0. In the figure above, we use the notation $\varepsilon_1\ R\ \varepsilon_2$ to denote $(\varepsilon_1, \varepsilon_2) \in R$. As illustrated in Figure 1, $R = \{(\varepsilon_1, \varepsilon_2), (\varepsilon_1, \varepsilon_1)\}$ is closed under transitions and is therefore a bisimulation. Hence, $\varepsilon_1$ and $\varepsilon_2$ are bisimilar and specify the same infinite stream (concretely, the stream with only zeros).

4. An Algebraic View on the Coalgebra of Generalized Regular Expressions

Recall that our goal is to reason about equality of generalized regular expressions in a fully automated manner. As we showed in the introduction, obtaining this equality can be achieved in two distinct ways: either algebraically, reasoning with the axioms, or coalgebraically, by constructing a bisimulation relation. The latter, because of its algorithmic nature, is particularly suited for automation. Automatic constructions of bisimulations have been widely explored in CIRC and we will use this tool to implement our algorithm. This section contains material that enables us to soundly use CIRC. We want to stress however that the main result of the paper is the description of a decision procedure to determine whether two expressions are equivalent or not. This procedure in turn could be implemented in any other suitable tool or even as a standalone application. Choosing CIRC was natural for us, given the pre-existent work on bisimulation constructions. In Section 5, we show that the process of generating the $G$-coalgebras associated to expressions by repeatedly applying $\delta_G$ and normalizing the expressions obtained at each step is closely related to the proving mechanism already existent in CIRC.

In Section 2, we have introduced a (theoretical) framework which, given a functor $G$, allows for the uniform derivation of 1) a language $Exp_G$ for specifying behaviors of $G$-systems, and 2) a coalgebraic structure on $Exp_G$, which provides an operational semantics to the set of expressions. In this context, given that CIRC is based on algebraic specifications, we need two things in order to reach our final goal:
• extend and adapt the framework of Section 2 in order to enable the implementation of a tool which allows the automatic derivation of algebraic specifications that model 1) and 2) above, to deliver to CIRC;
• provide a decision procedure, implemented in CIRC based on an equational entailment relation, in order to check bisimilarity of expressions.

In the rest of the paper we will present the algebraic setting for reasoning on bisimilarity of generalized regular expressions. A brief overview on the parallel between the coalgebraic concepts in [23] and their algebraic correspondents introduced in this section is provided later, in Figure 2.
Algebraic specifications. An algebraic specification is a triple $\mathcal{E} = (S, \Sigma, E)$, where $S$ is a set of sorts, $\Sigma$ is an $S$-sorted signature and $E$ is a set of conditional equations of the form $(\forall X)\ t = t'$ if $(\bigwedge_{i \in I} u_i = v_i)$, where $t$, $t'$, $u_i$ and $v_i$ ($i \in I$, a set of indices for the conditions) are $\Sigma$-terms with variables in $X$. We say that the sort of the equation is $s$ whenever $t, t' \in T_{\Sigma,s}(X)$. Here, $T_{\Sigma,s}(X)$ denotes the set of terms of sort $s$ of the $\Sigma$-algebra freely generated by $X$. If $I = \emptyset$ then the equation is unconditional and may be written as $(\forall X)\ t = t'$. Let $\vdash$ be the equational entailment (deduction) relation defined as in [6]. For consistency reasons, we write $\mathcal{E} \vdash e$ whenever equation $e$ is deducible from the equations $E$ in $\mathcal{E}$ by reflexivity, symmetry, transitivity, congruence or substitutivity (i.e., whenever $E \vdash e$).

In this paper, the algebraic specifications of coalgebras of generalized regular expressions are built on top of definitions based on grammars in Backus-Naur form (BNF) such as (1) and (2). Therefore, in what follows, we introduce the general technique for transforming BNF notations into algebraic specifications.

From BNF grammars to algebraic specifications. The general rule used for translating definitions based on BNF grammars into algebraic specifications is as follows: each syntactical category and vocabulary is considered as a sort and each production is considered as a constructor operation or a subsort relation.

For instance, according to the grammar (1) of non-deterministic functors, we have a sort SltName, representing the vocabulary of join-semilattices $B$, a sort AlphName, for the vocabulary of the alphabets $A$, a sort Functor, associated to the syntactical category of the non-deterministic functors $G$, a subsort relation SltName < Functor representing the production $G ::= B$, and constructor operations for the other productions.

Generally, each production $A ::= rhs$ gives rise to a constructor $(rhs) \to (A)$, the direction of the arrow being reversed. For instance, for grammar (1), the production $G ::= Id$ is represented by a constant (nullary operation) Id : → Functor, and the sum construction by the binary operation _✸+_ : Functor Functor → Functor.

Remark 3. Note that the above mechanism for translating BNF grammars into algebraic specifications makes use of subsort relations for representing productions such as $G ::= B$. This is because CIRC works with order-sorted algebras, and we want to keep the algebraic specifications of non-deterministic functors as close as possible to their implementation in CIRC. However, order-sorted algebras can be reduced to many-sorted algebras [6], where a subsort relation $s < s'$ is modeled by an inclusion operation $c_{s,s'} : s \to s'$. This way, even if we use order-sorted algebras, we remain in the framework of circular coinduction.

The algebraic specifications of coalgebras of generalized regular expressions are defined in a modular fashion, based on the specifications of:
• non-deterministic functors ($G$);
• generalized regular expressions ($\varepsilon \in Exp_G$);
• "transition" functions ($\delta_G$);
• "structured" expressions ($\sigma \in F(Exp_G)$, for all $F$ ingredients of $G$).

Moreover, recall that for a non-deterministic functor $G$, bisimilarity of $G$-expressions is decided based on the relation lifting $\overline{G}$ over "structured" expressions in $G(Exp_G)$ (Definition 1). Therefore, the deduction relation $\vdash$ has to be extended to allow a restricted contextual reasoning over "structured" expressions in $F(Exp_G)$, for all ingredients $F$ of $G$.

The aforementioned algebraic specifications and the extension of $\vdash$ are modeled as follows.

The algebraic specification of a non-deterministic functor $G$. It includes:
• the translation of the BNF grammar (1), as presented above;
• the specification of the functor ingredients, given by a sort Ingredient and a constructor _⊳_ : Functor Functor → Ingredient (according to Definition 2);
• the specification of each alphabet $A = \{a_1, \ldots, a_n\}$ occurring in the definition of $G$: this consists of a subsort $A$ < Alph, a constant $a_i$ : → $A$ for $i \in \overline{1,n}$, and a distinguished constant $A$ of sort AlphName used to refer to the alphabet in the definition of the functor;
• the specification of each semilattice $B = (\{b_1, \ldots, b_n\}, \vee, \bot_B)$ occurring in the definition of $G$: this consists of a subsort $B$ < Slt, a constant $b_i$ : → $B$ for $i \in \overline{1,n}$, a distinguished constant $B$ of sort SltName used to refer to the corresponding semilattice in the definition of the functor, and the equations defining $\vee$ and $\bot_B$ (this should be one of the $b_i$);
• an equation defining $G$ (as a functor expression).

The algebraic specification of generalized regular expressions.
It consists of:
• (according to the BNF grammar in Definition 3) a sort Exp representing expressions $\varepsilon$, FixpVar the sort for the vocabulary of the fixed-point variables, and Slt the sort for the elements of semilattices. Moreover, we consider constructor operations for all the productions. For example, the production $\varepsilon ::= \varepsilon \oplus \varepsilon$ is represented by an operation $\_ \oplus \_$ : Exp Exp → Exp, and $\varepsilon ::= \mu x.\gamma$ is represented by $\mu\_.\_$ : FixpVar Exp → Exp. (At this stage, in the definition of $\mu\_.\_$, we chose not to impose any restriction guaranteeing that $\gamma$ is a guarded expression. However, guards can easily be checked by pattern matching, according to the grammars in Definition 3);
• the specification of the substitution of a fixed-point variable with an expression, given by an operation $\_[\_/\_]$ : Exp Exp FixpVar → Exp and a set of equations, one for each constructor. For example, the equations associated to $\emptyset$ and $\oplus$ are $\emptyset[\varepsilon/x] = \emptyset$ and, respectively, $(\varepsilon_1 \oplus \varepsilon_2)[\varepsilon/x] = (\varepsilon_1[\varepsilon/x]) \oplus (\varepsilon_2[\varepsilon/x])$, where $\varepsilon, \varepsilon_1, \varepsilon_2$ are $\mathcal{G}$-expressions and $x$ is a fixed-point variable;
• the specification of the type-checking relation in Definition 4, given by an operation $\_ : \_$ : Exp Ingredient → Bool and an equation for each inference rule defining this relation. For example, the rule
$$\frac{\vdash \varepsilon_1 : \mathcal{F} \lhd \mathcal{G} \qquad \vdash \varepsilon_2 : \mathcal{F} \lhd \mathcal{G}}{\vdash \varepsilon_1 \oplus \varepsilon_2 : \mathcal{F} \lhd \mathcal{G}}$$
is represented by the equation $\varepsilon_1 \oplus \varepsilon_2 : \mathcal{F} \lhd \mathcal{G} = \varepsilon_1 : \mathcal{F} \lhd \mathcal{G} \wedge \varepsilon_2 : \mathcal{F} \lhd \mathcal{G}$. The type-checking operator is used to verify whether the expressions checked for equivalence are well-typed (Definition 5). Moreover, note that, for consistency of notation, algebraically we write $\varepsilon : \mathcal{F} \lhd \mathcal{G}$ to represent expressions $\varepsilon$ of type $\mathcal{F} \lhd \mathcal{G}$.

The algebraic specification of $\delta_{\mathcal{G}}$. It consists of:
• the specification of the coalgebra of $\mathcal{G}$-expressions $\delta_{\mathcal{G}}$, given by three operations $\delta_{\_}(\_)$ : Ingredient Exp → ExpStruct, $Empty_{\_}$ : Ingredient → ExpStruct, and $Plus_{\_}(\_,\_)$ : Ingredient ExpStruct ExpStruct → ExpStruct;
• a set of equations describing the definitions of these operations as in [23].

The algebraic specification of structured expressions.
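As mentioned above, the set of $\mathcal{G}$-expressions is provided with a coalgebraic structure given by the function $\delta_{\mathcal{G}} : Exp_{\mathcal{G}} \to \mathcal{G}(Exp_{\mathcal{G}})$, where $\mathcal{G}(Exp_{\mathcal{G}})$ can be understood as the set of expressions with structure given by $\mathcal{G}$ (and its ingredients). The set of structured expressions is defined by the following grammar:
$$\sigma ::= \varepsilon \mid b \mid \langle \sigma, \sigma \rangle \mid k_1(\sigma) \mid k_2(\sigma) \mid \bot \mid \top \mid \lambda.(a, \mathcal{F} \lhd \mathcal{G}, \sigma) \mid \{\sigma\} \qquad (5)$$
where $\varepsilon \in Exp_{\mathcal{G}}$ and $b \in \mathsf{B}$. The typing rules below give precise meaning to these expressions. Note that $\bot$, $\top$ are two expressions coming from $\mathcal{G} = \mathcal{G}_1 \mathbin{✸\!\!\!+} \mathcal{G}_2$, used to denote underspecification and overspecification, respectively.

The associated algebraic specification includes:
• a sort ExpStruct representing expressions $\sigma$ (from $\mathcal{F}(Exp_{\mathcal{G}})$, with $\mathcal{F} \lhd \mathcal{G}$), and one operation for each production in the BNF grammar (5). Note that the construction $\lambda.(a, \mathcal{F} \lhd \mathcal{G}, \sigma)$ has as coalgebraic correspondent a function $f \in \mathcal{F}^A(Exp_{\mathcal{G}})$, and is defined by cases as follows: $\lambda.(a, \mathcal{F} \lhd \mathcal{G}, \sigma)(a') = $ if $(a = a')$ then $\sigma$ else $Empty_{\mathcal{F} \lhd \mathcal{G}}$;
• the extension of the type-checking relation to structured expressions, defined by:
$$\frac{\vdash b : \mathsf{B} \lhd \mathcal{G}}{\vdash b \in \mathsf{B}(Exp_{\mathcal{G}})} \qquad \frac{\vdash \varepsilon : Id \lhd \mathcal{G}}{\vdash \varepsilon \in Id(Exp_{\mathcal{G}})} \qquad \frac{}{\vdash \bot \in \mathcal{F}_1 \mathbin{✸\!\!\!+} \mathcal{F}_2(Exp_{\mathcal{G}})} \qquad \frac{}{\vdash \top \in \mathcal{F}_1 \mathbin{✸\!\!\!+} \mathcal{F}_2(Exp_{\mathcal{G}})}$$
$$\frac{\vdash \sigma \in \mathcal{F}_i(Exp_{\mathcal{G}})}{\vdash k_i(\sigma) \in \mathcal{F}_1 \mathbin{✸\!\!\!+} \mathcal{F}_2(Exp_{\mathcal{G}})}\ i \in \overline{1,2} \qquad \frac{\vdash \sigma_1 \in \mathcal{F}_1(Exp_{\mathcal{G}}) \quad \vdash \sigma_2 \in \mathcal{F}_2(Exp_{\mathcal{G}})}{\vdash \langle \sigma_1, \sigma_2 \rangle \in \mathcal{F}_1 \times \mathcal{F}_2(Exp_{\mathcal{G}})}$$
$$\frac{\vdash \sigma \in \mathcal{F}(Exp_{\mathcal{G}}),\ a \in A}{\vdash \lambda.(a, \mathcal{F} \lhd \mathcal{G}, \sigma) \in \mathcal{F}^A(Exp_{\mathcal{G}})} \qquad \frac{\vdash \sigma \in \mathcal{F}(Exp_{\mathcal{G}})}{\vdash \{\sigma\} \in \mathcal{P}_\omega \mathcal{F}(Exp_{\mathcal{G}})}$$
and specified by an operation $\_ \in \_(Exp\,\_)$ : ExpStruct Functor Functor → Bool (where we used a mix-fix notation) and an equation for each of the above inference rules. For example, the first rule has associated the equation $b \in \mathsf{B}(Exp_{\mathcal{G}}) = b : \mathsf{B} \lhd \mathcal{G}$. For consistency of notation, we write $\sigma \in \mathcal{F}(Exp_{\mathcal{G}})$ to denote that $\sigma$ is an element of $\mathcal{F}(Exp_{\mathcal{G}})$.

Remark 4.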
In terms of membership equational logic (MEL) [3], both F ⊳ G and F ( Exp G ) can be thought of as being sorts and, for example, ε : F ⊳ G as amembership assertion. Even if MEL is an elegant theory, we prefer not to use ithere because this implies the dynamic declaration of sorts and a set of assertionsfor such a sort. The above approach is generic and therefore more flexible. The equational entailment relation ⊢ NDF for bisimilarity checking.
As previously hinted at the beginning of this section, in order to reason algebraically on bisimilarity of $\mathcal{G}$-expressions in CIRC, one has to extend the deduction relation $\vdash$ to allow a restricted contextual reasoning on expressions in $\mathcal{F}(Exp_{\mathcal{G}})$, for all ingredients $\mathcal{F}$ of a non-deterministic functor $\mathcal{G}$. We call the extended entailment $\vdash_{NDF}$.

The aforementioned restriction refers to inhibiting the use of congruence during equational reasoning, in order to guarantee the soundness of CIRC proofs. This is realized by means of a freezing operator, which intuitively behaves as a wrapper on the expressions checked for equivalence, by changing their sort to a fresh sort Frozen. This way, the hypotheses collected during a CIRC proof session cannot be used freely in contextual reasoning, hence preventing the derivation of untrue equations (as illustrated in Example 2).

We further show how the freezing mechanism is implemented in our algebraic setting, and define $\vdash_{NDF}$.

Let $\mathcal{E}$ be an algebraic specification. We extend $\mathcal{E}$ by adding the freezing operation $\boxed{-} : s \to Frozen$ for each sort $s \in \Sigma$, where Frozen is a fresh sort. By $\boxed{t}$ we represent the frozen form of a $\Sigma$-term $t$, and by $\boxed{e}$ a frozen equation of the shape $(\forall X)\ \boxed{t} = \boxed{t'}$ if $c$. The entailment relation $\vdash$ is defined over frozen equations following the line in [17]; more details are provided in Section 5.

Recall from Section 2 that a relation $R \subseteq Exp_{\mathcal{G}} \times Exp_{\mathcal{G}}$ is a bisimulation if and only if $(s, t) \in R \Rightarrow (\delta_{\mathcal{G} \lhd \mathcal{G}}(s), \delta_{\mathcal{G} \lhd \mathcal{G}}(t)) \in \overline{\mathcal{G}}(R)$. Here, $\overline{\mathcal{G}}(R) \subseteq \mathcal{G}(Exp_{\mathcal{G}}) \times \mathcal{G}(Exp_{\mathcal{G}})$ is the lifting of the relation $R \subseteq Exp_{\mathcal{G}} \times Exp_{\mathcal{G}}$, defined as
$$\overline{\mathcal{G}}(R) = \{ (\mathcal{G}(\pi_1)(x), \mathcal{G}(\pi_2)(x)) \mid x \in \mathcal{G}(R) \}.$$
So, intuitively, reasoning on bisimilarity of two expressions $(\varepsilon, \varepsilon')$ in $R$ reduces to checking whether the application of $\delta_{\mathcal{G}}$ maps them into $\overline{\mathcal{G}}(R)$.

Therefore, checking whether a pair $(s^{\delta}, t^{\delta})$ is in $\overline{\mathcal{G}}(R)$ consists in checking, for example for the case of $\mathcal{G} = \mathcal{G}_1 \times \mathcal{G}_2$, whether $(s^{\delta}_1, t^{\delta}_1) \in \overline{\mathcal{G}_1}(R)$ and $(s^{\delta}_2, t^{\delta}_2) \in \overline{\mathcal{G}_2}(R)$, where $s^{\delta} = \langle s^{\delta}_1, s^{\delta}_2 \rangle$ and $t^{\delta} = \langle t^{\delta}_1, t^{\delta}_2 \rangle$. In an algebraic setting, this would reduce to building an algebraic specification $\mathcal{E}$ and defining an entailment relation $\vdash_{NDF}$ such that one can infer $\mathcal{E} \vdash_{NDF} \boxed{\langle s^{\delta}_1, s^{\delta}_2 \rangle = \langle t^{\delta}_1, t^{\delta}_2 \rangle}$ (this is the algebraic correspondent we consider for $(\langle s^{\delta}_1, s^{\delta}_2 \rangle, \langle t^{\delta}_1, t^{\delta}_2 \rangle) \in \overline{\mathcal{G}}(R)$) by showing $\mathcal{E} \vdash_{NDF} \boxed{s^{\delta}_1 = t^{\delta}_1}$ (or $(s^{\delta}_1, t^{\delta}_1) \in \overline{\mathcal{G}_1}(R)$) and $\mathcal{E} \vdash_{NDF} \boxed{s^{\delta}_2 = t^{\delta}_2}$ (or $(s^{\delta}_2, t^{\delta}_2) \in \overline{\mathcal{G}_2}(R)$). We hint that the aforementioned algebraic specification $\mathcal{E}$ consists of $\mathcal{E}_{\mathcal{G}}$ and a set of frozen equations (see Corollary 1).

The entailment relation $\vdash_{NDF}$ for reasoning on bisimilarity of $\mathcal{G}$-expressions is based on the definition of $\overline{\mathcal{G}}$.

Definition 6.
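The entailment relation $\vdash_{NDF}$ is the extension of $\vdash$ with the following inference rules, which allow a restricted contextual reasoning over the frozen equations of structured expressions:
$$\frac{\mathcal{E}_{\mathcal{G}} \vdash_{NDF} \boxed{\sigma_1 = \sigma_1'} \qquad \mathcal{E}_{\mathcal{G}} \vdash_{NDF} \boxed{\sigma_2 = \sigma_2'}}{\mathcal{E}_{\mathcal{G}} \vdash_{NDF} \boxed{\langle \sigma_1, \sigma_2 \rangle = \langle \sigma_1', \sigma_2' \rangle}} \qquad (6)$$
$$\frac{\mathcal{E}_{\mathcal{G}} \vdash_{NDF} \boxed{\sigma = \sigma'}}{\mathcal{E}_{\mathcal{G}} \vdash_{NDF} \boxed{k_i(\sigma) = k_i(\sigma')}}\ (i \in \overline{1,2}) \qquad (7)$$
$$\frac{\mathcal{E}_{\mathcal{G}} \vdash_{NDF} \boxed{f(a) = g(a)}, \text{ for all } a \in A}{\mathcal{E}_{\mathcal{G}} \vdash_{NDF} \boxed{f = g}} \qquad (8)$$
$$\frac{\mathcal{E}_{\mathcal{G}} \vdash_{NDF} \boxed{\sigma_{i_1} = \sigma'_{j_1}}, \ \ldots, \ \mathcal{E}_{\mathcal{G}} \vdash_{NDF} \boxed{\sigma_{i_k} = \sigma'_{j_k}}}{\mathcal{E}_{\mathcal{G}} \vdash_{NDF} \boxed{\{\sigma_1, \ldots, \sigma_n\} = \{\sigma'_1, \ldots, \sigma'_m\}}} \quad \begin{array}{l} \{i_1, \ldots, i_k\} = \{1, \ldots, n\} \\ \{j_1, \ldots, j_k\} = \{1, \ldots, m\} \end{array} \qquad (9)$$

Remark 5. Note that the extension of the entailment relation $\vdash$ to $\vdash_{NDF}$ implies that $\mathcal{E}_{\mathcal{G}} \vdash e$ iff $\mathcal{E}_{\mathcal{G}} \vdash_{NDF} e$ holds, for any equation $e$ of shape $\varepsilon_1 = \varepsilon_2$ or $\boxed{\varepsilon_1 = \varepsilon_2}$, with $\varepsilon_1, \varepsilon_2$ non-structured expressions. Below, we will use the notation $\mathcal{E}_{\mathcal{G}} \vdash_{NDF} R$, where $R$ is a set of possibly frozen equations, to denote $\forall e \in R \cdot \mathcal{E}_{\mathcal{G}} \vdash_{NDF} e$.

It is interesting to recall the relation lifting for the powerset functor, which is encoded in the last rule of Definition 6. A pair $(U, V)$ is in $\overline{\mathcal{P}_\omega \mathcal{G}}(R)$ if and only if for every $u \in U$ there exists a $v \in V$ such that $(u, v)$ belongs to $\overline{\mathcal{G}}(R)$ and, conversely, for every $v \in V$ there exists a $u \in U$ such that $(u, v)$ belongs to $\overline{\mathcal{G}}(R)$.

Remark 6.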
As already hinted (and proved in Corollary 1), reasoning on bisimilarity of expressions in a binary relation $R \subseteq Exp_{\mathcal{G}} \times Exp_{\mathcal{G}}$ reduces to showing that $\boxed{\delta_{\mathcal{G}}(s) = \delta_{\mathcal{G}}(t)}$ is a $\vdash_{NDF}$-consequence, for all $(s, t) \in R$. The equational proof is performed in a "top-down" fashion, by reasoning on the subsequent equalities between the components of the corresponding structured expressions $\delta_{\mathcal{G}}(s), \delta_{\mathcal{G}}(t)$ in an inductive manner. This is realized by applying the inverted rules (6)–(9).

Moreover, note that rule (9) is not invertible in the usual sense; rather, any statement matching the form of the conclusion can only be proved by some instance of the rule.
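The top-down use of the inverted rules (6)–(9) can be sketched as a small recursive check. This is only an illustration under stated assumptions, not the CIRC implementation: the tagged-tuple encoding of structured expressions (`"pair"`, `"inj"`, `"fun"`, `"set"`), the function name `entails`, and the representation of frozen hypotheses as a set of pairs are all hypothetical. At the leaves the sketch uses reflexivity and symmetry only; transitivity is omitted for brevity.

```python
def entails(s, t, hyps):
    """Check s = t top-down by inverting rules (6)-(9).

    Structured expressions are tagged tuples (an assumed encoding):
      ("pair", s1, s2), ("inj", i, s), ("fun", ((a, s_a), ...)), ("set", frozenset)
    Anything else is a leaf: an expression name or a semilattice value.
    hyps is a set of leaf pairs playing the role of frozen hypotheses.
    """
    if isinstance(s, tuple) and isinstance(t, tuple) and s[0] == t[0]:
        tag = s[0]
        if tag == "pair":   # rule (6): compare componentwise
            return entails(s[1], t[1], hyps) and entails(s[2], t[2], hyps)
        if tag == "inj":    # rule (7): same injection index, then the argument
            return s[1] == t[1] and entails(s[2], t[2], hyps)
        if tag == "fun":    # rule (8): pointwise over the finite alphabet
            f, g = dict(s[1]), dict(t[1])
            return f.keys() == g.keys() and all(
                entails(f[a], g[a], hyps) for a in f)
        if tag == "set":    # rule (9): every element has a match on the other side
            return (all(any(entails(u, v, hyps) for v in t[1]) for u in s[1])
                    and all(any(entails(u, v, hyps) for u in s[1]) for v in t[1]))
    # Leaves: reflexivity, or a hypothesis used as-is (up to symmetry).
    return s == t or (s, t) in hyps or (t, s) in hyps


hyps = {("e1", "e2")}
ok = entails(("pair", 0, "e1"), ("pair", 0, "e2"), hyps)
bad = entails(("pair", 0, "e1"), ("pair", 1, "e2"), hyps)
```

The `"set"` case mirrors the observation about rule (9): there is no single premise to invert, so the check searches for some matching on both sides.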
We will further formalize the connection between the inductive definition of $\overline{\mathcal{G}}$ (on the coalgebraic side) and $\vdash_{NDF}$ (on the algebraic side) in Theorem 2, hence enabling the definition of bisimulations in algebraic terms, in Corollary 1.
Remark 7. Equations in $\mathcal{E}_{\mathcal{G}}$ (built as previously described in this section) are used in the equational reasoning only for reducing terms of shape $op(t_1, \ldots, t_n)$ according to the definition of the operation $op$. For the simplicity of the proofs of Theorem 2 and Corollary 1, whenever we write $op(t_1, \ldots, t_n)$, we refer to the associated term reduced according to the definition of $op$.

First we introduce some notation conventions. Let $\mathcal{G}$ be a non-deterministic functor and $R \subseteq Exp_{\mathcal{G}} \times Exp_{\mathcal{G}}$. We write:
• $R_{id}$ to denote the set $R \cup \{ (\varepsilon, \varepsilon) \mid \mathcal{E}_{\mathcal{G}} \vdash \varepsilon : \mathcal{G} \lhd \mathcal{G} = true \}$;
• $cl(R)$ for the closure of $R$ under transitivity, symmetry and reflexivity;
• $\boxed{R}$ to represent the set $\bigcup_{e \in R} \{ \boxed{e} \}$ (application of the freezing operator to all elements of $R$);
• $\delta_{\mathcal{G} \lhd \mathcal{G}}(\varepsilon = \varepsilon')$ to represent the equation $\delta_{\mathcal{G} \lhd \mathcal{G}}(\varepsilon) = \delta_{\mathcal{G} \lhd \mathcal{G}}(\varepsilon')$;
• $\mathcal{E}_{\mathcal{G}} \cup R$ as a shorthand for $(S, \Sigma, E \cup \{ \varepsilon = \varepsilon' \mid (\varepsilon, \varepsilon') \in R \})$, where $\mathcal{E}_{\mathcal{G}} = (S, \Sigma, E)$;
• $(\sigma, \sigma') \in \overline{\mathcal{G}}(R)$ as a shorthand for: $(\sigma, \sigma')$ is among the enumerated elements of a set $S$ explicitly constructed as an enumeration of the finite set $\overline{\mathcal{G}}(R)$ (in the algebraic setting, $\overline{\mathcal{G}}(R)$ is a subset of $T_{\Sigma, ExpStruct} \times T_{\Sigma, ExpStruct}$ and $\mathcal{E}_{\mathcal{G}} \vdash \overline{\mathcal{G}}(R) = S$).

Theorem 2.
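Consider a non-deterministic functor $\mathcal{G}$. Let $\mathcal{F}$ be an ingredient of $\mathcal{G}$, $R$ a binary relation on the set of $\mathcal{G}$-expressions, and $\sigma, \sigma' \in \mathcal{F}(Exp_{\mathcal{G}})$.
a) If $\mathcal{G}$ is not a constant functor, then $(\sigma, \sigma') \in \overline{\mathcal{F}}(cl(R_{id}))$ iff $\mathcal{E}_{\mathcal{G}} \cup \boxed{R} \vdash_{NDF} \boxed{\sigma = \sigma'}$;
b) If $\mathcal{G}$ is a constant functor $\mathsf{B}$, then $(\sigma, \sigma') \in \overline{\mathsf{B}}(cl(R_{id}))$ iff $\mathcal{E}_{\mathcal{G}} \vdash_{NDF} \boxed{\sigma = \sigma'}$.

In order to prove Theorem 2.a) we introduce the following lemma:

Lemma 1. Consider $\mathcal{G}$ a non-deterministic functor and $R$ a binary relation on the set of $\mathcal{G}$-expressions. If $(\varepsilon, \varepsilon') \in cl(R_{id})$ then $\mathcal{E}_{\mathcal{G}} \cup \boxed{R} \vdash_{NDF} \boxed{\varepsilon = \varepsilon'}$.

Proof. The proof is trivial, as equality is reflexive, symmetric and transitive. □

We are now ready to prove Theorem 2.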
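Proof (Theorem 2).
• Proof of Theorem 2.a).
  • "$\Rightarrow$". The proof is by induction on the structure of $\mathcal{F}$.
    Base case:
    ∗ $\mathcal{F} = \mathsf{B}$. It follows that $(\sigma, \sigma')$ is of shape $(b, b)$ where $b \in \mathsf{B}$, therefore $\mathcal{E}_{\mathcal{G}} \cup \boxed{R} \vdash_{NDF} \boxed{b = b}$ holds by reflexivity.
    ∗ $\mathcal{F} = Id$. In this case $(\sigma, \sigma') \in cl(R_{id}) = \overline{Id}(cl(R_{id}))$, so the result follows immediately by Lemma 1.
    Induction step:
    ∗ $\mathcal{F} = \mathcal{F}_1 \times \mathcal{F}_2$. Obviously, $\sigma = \langle \sigma_1, \sigma_2 \rangle$ and $\sigma' = \langle \sigma_1', \sigma_2' \rangle$, where $(\sigma_1, \sigma_1') \in \overline{\mathcal{F}_1}(cl(R_{id}))$ and $(\sigma_2, \sigma_2') \in \overline{\mathcal{F}_2}(cl(R_{id}))$. Therefore, by the induction hypothesis, both $\mathcal{E}_{\mathcal{G}} \cup \boxed{R} \vdash_{NDF} \boxed{\sigma_1 = \sigma_1'}$ and $\mathcal{E}_{\mathcal{G}} \cup \boxed{R} \vdash_{NDF} \boxed{\sigma_2 = \sigma_2'}$ hold. Hence, according to the definition of $\vdash_{NDF}$ (see (6)), we conclude that $\mathcal{E}_{\mathcal{G}} \cup \boxed{R} \vdash_{NDF} \boxed{\langle \sigma_1, \sigma_2 \rangle = \langle \sigma_1', \sigma_2' \rangle}$ holds.
    ∗ The cases $\mathcal{F} = \mathcal{F}_1 \mathbin{✸\!\!\!+} \mathcal{F}_2$, $\mathcal{F} = \mathcal{F}_1^A$ and $\mathcal{F} = \mathcal{P}_\omega \mathcal{F}'$ are handled in a similar way.
  • "$\Leftarrow$". We also proceed by induction on the structure of $\mathcal{F}$. Moreover, recall that the observations in Remark 7 hold (for each of the subsequent cases).
    Base case:
    ∗ $\mathcal{F} = \mathsf{B}$. In this case $(\sigma, \sigma')$ is of shape $(b, b')$, where $b, b'$ are two elements of the semilattice $\mathsf{B}$. Also, recall that $\mathcal{G} \neq \mathsf{B}$; therefore, the equations in $\boxed{R}$ (of type $\mathcal{G} \lhd \mathcal{G} \neq \mathcal{F}(Exp_{\mathcal{G}})$) are not involved in the equational reasoning. We deduce that $b = b'$ is proved by reflexivity, hence $(b, b') = (b, b) \in \overline{\mathsf{B}}(cl(R_{id}))$.
    ∗ $\mathcal{F} = Id$. Note that for this case, $\sigma, \sigma'$ are expressions of the same type as the expressions in $R$. We further identify two possibilities:
      · $\sigma = \sigma'$ is proved by reflexivity, therefore $(\sigma, \sigma') \in \{ (\varepsilon, \varepsilon) \mid \varepsilon : \mathcal{G} \lhd \mathcal{G} \} \subseteq R_{id} \subseteq cl(R_{id}) = \overline{Id}(cl(R_{id}))$.
      · the equations in $\boxed{R}$ are used in the equational reasoning $\mathcal{E}_{\mathcal{G}} \cup \boxed{R} \vdash_{NDF} \boxed{\sigma = \sigma'}$. In addition, the freezing operator inhibits contextual reasoning, therefore $\sigma = \sigma'$ is proved according to the equations in $\boxed{R}$, based on the symmetry and transitivity of $\vdash_{NDF}$. In other words, $(\sigma, \sigma') \in cl(R_{id}) = \overline{Id}(cl(R_{id}))$.
    Induction step:
    ∗ $\mathcal{F} = \mathcal{F}_1 \times \mathcal{F}_2$. Obviously, due to their type, the equations in $\boxed{R}$ are not involved in the equational reasoning. Also, recall that (*) holds. Therefore, $\mathcal{E}_{\mathcal{G}} \cup \boxed{R} \vdash_{NDF} \boxed{\langle \sigma_1, \sigma_2 \rangle = \langle \sigma_1', \sigma_2' \rangle}$ is a consequence of the inverted rule (6). More explicitly, it follows that $\mathcal{E}_{\mathcal{G}} \cup \boxed{R} \vdash_{NDF} \boxed{\sigma_1 = \sigma_1'}$ and $\mathcal{E}_{\mathcal{G}} \cup \boxed{R} \vdash_{NDF} \boxed{\sigma_2 = \sigma_2'}$ must hold. By the induction hypothesis, we deduce that $(\sigma_1, \sigma_1') \in \overline{\mathcal{F}_1}(cl(R_{id}))$ and $(\sigma_2, \sigma_2') \in \overline{\mathcal{F}_2}(cl(R_{id}))$. So by the definition of $\overline{\mathcal{F}_1 \times \mathcal{F}_2}$ we conclude that $(\langle \sigma_1, \sigma_2 \rangle, \langle \sigma_1', \sigma_2' \rangle) = (\sigma, \sigma') \in \overline{\mathcal{F}_1 \times \mathcal{F}_2}(cl(R_{id}))$.
    ∗ The cases $\mathcal{F} = \mathcal{F}_1 \mathbin{✸\!\!\!+} \mathcal{F}_2$, $\mathcal{F} = (\mathcal{F}_1)^A$ and $\mathcal{F} = \mathcal{P}_\omega \mathcal{F}'$ follow a similar reasoning.
• Proof of Theorem 2.b). It follows immediately by the definition of $\overline{\mathsf{B}}$ and Remark 7. □

Corollary 1. Let $\mathcal{G}$ be a non-deterministic functor and $R$ a binary relation on the set of $\mathcal{G}$-expressions.
a) If $\mathcal{G}$ is not a constant functor, then $cl(R_{id})$ is a bisimulation iff $\mathcal{E}_{\mathcal{G}} \cup \boxed{R} \vdash_{NDF} \boxed{\delta_{\mathcal{G} \lhd \mathcal{G}}(R)}$;
b) If $\mathcal{G}$ is a constant functor $\mathsf{B}$, then $cl(R_{id})$ is a bisimulation iff $\mathcal{E}_{\mathcal{G}} \vdash_{NDF} \boxed{\delta_{\mathcal{G} \lhd \mathcal{G}}(R)}$.

Proof.
• Proof of Corollary 1.a). We reason as follows:
$$\begin{array}{lll} & cl(R_{id}) \text{ is a bisimulation} & \\ \Leftrightarrow & (\forall (\varepsilon, \varepsilon') \in cl(R_{id})).\ (\delta_{\mathcal{G} \lhd \mathcal{G}}(\varepsilon), \delta_{\mathcal{G} \lhd \mathcal{G}}(\varepsilon')) \in \overline{\mathcal{G}}(cl(R_{id})) & \text{(Def. 1)} \\ \Leftrightarrow & \mathcal{E}_{\mathcal{G}} \cup \boxed{R} \vdash_{NDF} \boxed{\delta_{\mathcal{G} \lhd \mathcal{G}}(cl(R_{id}))} & \text{(Thm. 2)} \\ \Leftrightarrow & \mathcal{E}_{\mathcal{G}} \cup \boxed{R} \vdash_{NDF} \boxed{\delta_{\mathcal{G} \lhd \mathcal{G}}(R)} & (cl(R_{id}),\ \vdash_{NDF}) \end{array}$$
• Proof of Corollary 1.b). It follows immediately by the definition of bisimulation relations and according to the observations in Remark 7. □

In Figure 2 we briefly summarize the results of the current section, namely, the algebraic encoding of the coalgebraic setting presented in [23].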
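5. A Decision Procedure for Bisimilarity in CIRC

In this section, we describe how the coinductive theorem prover CIRC [14] can be used to implement the decision procedure for the bisimilarity of generalized regular expressions, which we discussed above.

coalgebraic | algebraic
$\vdash \varepsilon : \mathcal{F} \lhd \mathcal{G}$ | $\mathcal{E}_{\mathcal{G}} \vdash \varepsilon : \mathcal{F} \lhd \mathcal{G} = true$
$Exp_{\mathcal{F} \lhd \mathcal{G}}$ | $\{ \varepsilon \in T_{\Sigma, Exp} \mid \mathcal{E}_{\mathcal{G}} \vdash \varepsilon : \mathcal{F} \lhd \mathcal{G} = true \}$
$Exp_{\mathcal{G}}$ | $\{ \varepsilon \in T_{\Sigma, Exp} \mid \mathcal{E}_{\mathcal{G}} \vdash \varepsilon : \mathcal{G} \lhd \mathcal{G} = true \}$
$\mathcal{F}(Exp_{\mathcal{G}})$ | $\{ \sigma \in T_{\Sigma, ExpStruct} \mid \mathcal{E}_{\mathcal{G}} \vdash \sigma \in \mathcal{F}(Exp_{\mathcal{G}}) = true \}$
$\delta_{\mathcal{F} \lhd \mathcal{G}} : Exp_{\mathcal{F} \lhd \mathcal{G}} \to \mathcal{F}(Exp_{\mathcal{G}})$ | $\delta_{\_}(\_)$ : Ingredient Exp → ExpStruct
$(\sigma, \sigma') \in \overline{\mathcal{F}}(cl(R_{id}))$ | given $\mathcal{E}_{\mathcal{G}} \vdash \sigma \in \mathcal{F}(Exp_{\mathcal{G}}) = true$ and $\mathcal{E}_{\mathcal{G}} \vdash \sigma' \in \mathcal{F}(Exp_{\mathcal{G}}) = true$: $\mathcal{E}_{\mathcal{G}} \cup \boxed{R} \vdash_{NDF} \boxed{\sigma = \sigma'}$ if $\mathcal{G} \neq \mathsf{B}$, or $\mathcal{E}_{\mathcal{G}} \vdash_{NDF} \boxed{\sigma = \sigma'}$ if $\mathcal{G} = \mathsf{B}$ (Thm. 2)
$cl(R_{id})$ is a bisimulation | $\mathcal{E}_{\mathcal{G}} \cup \boxed{R} \vdash_{NDF} \boxed{\delta_{\mathcal{G} \lhd \mathcal{G}}(R)}$ if $\mathcal{G} \neq \mathsf{B}$, or $\mathcal{E}_{\mathcal{G}} \vdash_{NDF} \boxed{\delta_{\mathcal{G} \lhd \mathcal{G}}(R)}$ if $\mathcal{G} = \mathsf{B}$ (Cor. 1)

Figure 2: non-deterministic functors - coalgebraic vs. algebraic approach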
CIRC can be seen as an extension of Maude with behavioral features, and its implementation is derived from that of Full-Maude. In order to use the prover, one needs to provide a specification (a CIRC theory) and a set of goals. A CIRC theory $\mathcal{B} = (S, (\Sigma, \Delta), (E, \mathcal{I}))$ consists of an algebraic specification $(S, \Sigma, E)$, a set $\Delta$ of derivatives, and a set $\mathcal{I}$ of equational interpolants, which are expressions of the form $e \Rightarrow \{ e_i \mid i \in I \}$ where $e$ and $e_i$ are equations. The intuition for this type of expressions is simple: $e$ holds whenever the equation $e_i$ holds for every $i$ in $I$. In other words, to prove $E \vdash e$ one can choose to prove $E \vdash \{ e_i \mid i \in I \}$ instead. For the particular case of non-deterministic functors, we use equational interpolants to extend the initial entailment relation in a way consistent with rules (6)–(9). (For more information on equational interpolants see [7].)

A derivative $\delta \in \Delta$ is a $\Sigma$-term containing a special variable $*{:}s$ (i.e., a $\Sigma$-context), where $s$ is the sort of the variable $*$. If $e$ is an equation $t = t'$ with $t$ and $t'$ of sort $s$, then $\delta[e]$ is $\delta[t/{*{:}s}] = \delta[t'/{*{:}s}]$. We call this type of equation a derivable equation. The other equations are non-derivable. We write $\delta[R]$ to represent $\{ \delta[e] \mid e \in R \}$, where $R$ is a set of derivable equations, and $\Delta[e]$ for the set $\{ \delta[e] \mid \delta \in \Delta \text{ appropriate for } e \}$.

Moreover, note that CIRC works with an extension of the entailment relation $\vdash$ over frozen equations (introduced in Section 4), with two more axioms, as in [17]:
$$E \cup R \vdash \boxed{e} \ \text{ iff } \ E \vdash e \qquad (10)$$
$$E \cup R \vdash G \ \text{ implies } \ E \cup \delta[R] \vdash \delta[G] \ \text{ for each } \delta \in \Delta \qquad (11)$$
Above, $E$ ranges over unfrozen equations, $e$ over non-derivable unfrozen equations, and $R$, $G$ over derivable frozen equations.

Remark 8. Note that the new entailment $\vdash_{NDF}$ extended over frozen equations (in Definition 6) satisfies the assumptions (10) and (11).
CIRC implements the coinductive proof system given in [17] using a set of reduction rules of the form $(\mathcal{B}, F, G) \Rightarrow (\mathcal{B}, F', G')$, where $\mathcal{B}$ represents a specification, $F$ is the coinductive hypothesis (a set of frozen equations) and $G$ is the current set of goals. The freezing operator is defined as described in Section 4. Here is a brief description of these rules:

[Done]: $(\mathcal{B}, F, \{\}) \Rightarrow \cdot$
Whenever the set of goals is empty, the system terminates with success.

[Reduce]: $(\mathcal{B}, F, G \cup \{ \boxed{e} \}) \Rightarrow (\mathcal{B}, F, G)$ if $\mathcal{B} \cup F \vdash \boxed{e}$
If the current goal is a $\vdash$-consequence of $\mathcal{B} \cup F$ then $\boxed{e}$ is removed from the set of goals.

[Derive]: $(\mathcal{B}, F, G \cup \{ \boxed{e} \}) \Rightarrow (\mathcal{B}, F \cup \{ \boxed{e} \}, G \cup \boxed{\Delta[e]})$ if $\mathcal{B} \cup F \nvdash \boxed{e}$
When the current goal $e$ is derivable and it is not a $\vdash$-consequence, it is added to the hypotheses and its derivatives to the set of goals.

[Simplify]: $(\mathcal{B}, F, G \cup \{ \boxed{\theta(e)} \}) \Rightarrow (\mathcal{B}, F, G \cup \{ \boxed{\theta(e_i)} \mid i \in I \})$
if $e \Rightarrow \{ e_i \mid i \in I \}$ is an equational interpolant from the specification and $\theta : X \to T_\Sigma(Y)$ is a substitution.

[Fail]: $(\mathcal{B}, F, G \cup \{ \boxed{e} \}) \Rightarrow failure$ if $\mathcal{B} \cup F \nvdash \boxed{e}$ $\wedge$ $e$ is non-derivable
This rule stops the reduction process with failure whenever the current goal $e$ is non-derivable and is not a $\vdash$-consequence of $\mathcal{B} \cup F$.

It is worth noting that there is a strong connection between a CIRC proof and the construction of a bisimulation relation. We illustrate this fact and the importance of the freezing operator with a simple example.

Example 2.
Consider the case of infinite streams. The set $B^\omega$ of infinite streams over a set $B$ is the final coalgebra of the functor $\mathcal{S} = B \times Id$, with a coalgebra structure given by hd and tl, the functions that return the head and the tail of the stream, respectively. Our purpose is to prove that $0^\infty = (00)^\infty$. Let $z$ and $zz$ represent the stream on the left hand side and, respectively, on the right hand side. These streams are defined by the equations: $hd(z) = 0$, $tl(z) = z$, $hd(zz) = 0$, $tl(zz) = 0{:}zz$. Note that equations over $B$ like $hd(z) = 0$ are not derivable and equations over streams like $tl(z) = z$ are derivable.

In Fig. 3 we present the correlation between the CIRC proof and the construction of the bisimulation relation. Note how CIRC collects the elements of the bisimulation as frozen hypotheses.

CIRC proof | Bisimulation construction
(add goal z = zz .) | states $z$, $zz$, $(zz)'$
$(\mathcal{B}, \{\}, \{ \boxed{z = zz} \})$ | $F = \{\}$; $z \sim zz$ ?
[Derive] $\longrightarrow (\mathcal{B}, \{ \boxed{z = zz} \}, \{ \boxed{hd(z) = hd(zz)}, \boxed{tl(z) = tl(zz)} \})$ | $F = \{ (z, zz) \}$; $z \longrightarrow z$, $zz \longrightarrow (zz)'$
[Reduce] $\longrightarrow (\mathcal{B}, \{ \boxed{z = zz} \}, \{ \boxed{z = 0{:}zz} \})$ | $F = \{ (z, zz) \}$; $z \sim (zz)'$ ?
[Derive] $\longrightarrow (\mathcal{B}, \{ \boxed{z = zz}, \boxed{z = 0{:}zz} \}, \{ \boxed{hd(z) = hd(0{:}zz)}, \boxed{tl(z) = tl(0{:}zz)} \})$ | $F = \{ (z, zz), (z, (zz)') \}$; $z \longrightarrow z$, $(zz)' \longrightarrow zz$
[Reduce] $\longrightarrow (\mathcal{B}, \{ \boxed{z = zz}, \boxed{z = 0{:}zz} \}, \{\})$ | $F = \{ (z, zz), (z, (zz)') \}$ ✓

Figure 3: Parallel between a CIRC proof and the bisimulation construction

Let us analyze what would happen if the freezing operator $\boxed{-}$ were not used. Suppose the circular coinduction algorithm added the equation $z = zz$ in its unfrozen form to the hypotheses. After applying the derivatives we obtain the goals $hd(z) = hd(zz)$, $tl(z) = tl(zz)$. At this point, the prover could use the freshly added equation $z = zz$ and, according to the congruence rule, both goals would be proven directly, though we would still be in the process of showing that the hypothesis holds. By following a similar reasoning, we could also prove that $0^\infty = 1^\infty$! In order to avoid these situations, the hypotheses are frozen (i.e., their sort is changed from Stream to Frozen), and this stops the application of the congruence rule, forcing the application of the derivatives according to their definition in the specification. Therefore, the use of the freezing operator is vital for the soundness of circular coinduction.
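The proof of Example 2 can be replayed with a small worklist loop. The sketch below is an assumption-laden illustration rather than CIRC itself: streams are given as a finite table mapping each state to its (head, tail) pair, the names `prove`, `system` and the state `"zz'"` (standing for $0{:}zz$) are hypothetical, and a frozen hypothesis is modeled as a pair that may only discharge a syntactically identical goal, never be used under hd or tl.

```python
def prove(goal, system):
    """Try to prove goal = (s, t); return the collected hypotheses or None.

    system maps a state to (head, tail). Hypotheses are "frozen": they are
    only matched against whole goals, which is what blocks congruence.
    """
    hyps = []            # F: frozen hypotheses, in order of derivation
    goals = [goal]       # G: current goals
    while goals:                            # [Done] when no goals remain
        s, t = goals.pop()
        if s == t or (s, t) in hyps:        # [Reduce]: goal already entailed
            continue
        (head_s, tail_s), (head_t, tail_t) = system[s], system[t]
        if head_s != head_t:                # [Fail]: non-derivable goal on heads
            return None
        hyps.append((s, t))                 # [Derive]: freeze the goal ...
        goals.append((tail_s, tail_t))      # ... and add its tl-derivative
    return hyps


system = {
    "z": (0, "z"),        # hd(z) = 0, tl(z) = z
    "zz": (0, "zz'"),     # hd(zz) = 0, tl(zz) = 0:zz
    "zz'": (0, "zz"),     # hd(0:zz) = 0, tl(0:zz) = zz
    "ones": (1, "ones"),  # the stream 1^infinity, for the negative case
}
bisim = prove(("z", "zz"), system)     # [("z", "zz"), ("z", "zz'")]
refuted = prove(("z", "ones"), system) # None
```

The returned list corresponds to the final $F$ of Figure 3, while the second call shows that $0^\infty = 1^\infty$ is rejected because the heads must agree before any hypothesis is recorded.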
Next, we focus on using CIRC for automatically reasoning on the equivalence of $\mathcal{G}$-expressions. As we will show, the implementation of both the algebraic specifications associated to non-deterministic functors and the equational entailment relation described in Section 4 is immediate. Given a non-deterministic functor $\mathcal{G}$, we define a CIRC theory $\mathcal{B}_{\mathcal{G}} = (S, (\Sigma, \Delta), (E, \mathcal{I}))$ as follows:
• $(S, \Sigma, E)$ is $\mathcal{E}_{\mathcal{G}}$;
• $\Delta = \{ \delta_{\mathcal{G} \lhd \mathcal{G}}(*{:}Exp) \}$, so the only derivable equations are those of sort Exp. As we have already seen for the example of streams, equations of sort Slt must not be derivable. Since we have the subsort relation Slt < Exp, we avoid the application of the derivative $\delta_{\mathcal{G} \lhd \mathcal{G}}(*{:}Exp)$ over equations of sort Slt by means of an interpolant (see below);
• $\mathcal{I}$ consists of the following equational interpolants, whose role is to replace current proof obligations over non-trivial structures with simpler ones:
$$\langle \sigma_1, \sigma_2 \rangle = \langle \sigma_1', \sigma_2' \rangle \Rightarrow \{ \sigma_1 = \sigma_1', \sigma_2 = \sigma_2' \} \qquad (12)$$
$$k_i(\sigma) = k_i(\sigma') \Rightarrow \{ \sigma = \sigma' \} \qquad (13)$$
$$f = g \Rightarrow \{ f(a) = g(a) \mid a \in A \} \qquad (14)$$
$$\bigcup_{i \in \overline{1,n}} \{ \sigma_i \} = \bigcup_{j \in \overline{1,m}} \{ \sigma'_j \} \Rightarrow \Big\{ \bigwedge_{i \in \overline{1,n}} \Big( \bigvee_{j \in \overline{1,m}} \sigma_i = \sigma'_j \Big) \wedge \bigwedge_{j \in \overline{1,m}} \Big( \bigvee_{i \in \overline{1,n}} \sigma_i = \sigma'_j \Big) \Big\} \qquad (15)$$
together with an equational interpolant
$$t = t' \Rightarrow \{ t \simeq t' = true \} \qquad (16)$$
where $\simeq$ is the equality predicate equationally defined over the sort Slt. The last interpolant transforms the equations of sort Slt from derivable (because of the subsort relation Slt < Exp) into non-derivable and equivalent ones.

The interpolants (12)–(16) in $\mathcal{I}$ extend the entailment relation $\vdash_{NDF}$ (introduced in Definition 6) as follows:
$$\frac{E \vdash_{NDF} \{ e_i \mid i \in I \}}{E \vdash_{NDF} e} \quad \text{if } e \Rightarrow \{ e_i \mid i \in I \} \text{ in } \mathcal{I}$$

Theorem 3 (Soundness).
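Let $\mathcal{G}$ be a non-deterministic functor, and $G$ a binary relation on the set of $\mathcal{G}$-expressions. If $(\mathcal{B}_{\mathcal{G}}, F_0 = \{\}, G_0 = \boxed{G}) \stackrel{*}{\Rightarrow} (\mathcal{B}_{\mathcal{G}}, F_n, G_n = \{\})$ using [Reduce], [Derive] and [Simplify], then $G \subseteq\ \sim_{\mathcal{G}}$.

Proof. The idea of the proof is to find a bisimulation relation $\widetilde{F}$ such that $G \subseteq \widetilde{F}$. First let $F$ represent the set of hypotheses (or derived goals) collected during the proof session. We distinguish between two cases:
a) $\mathcal{G} = \mathsf{B}$. For this case, the set of expressions in $G$ is given by the following grammar:
$$\varepsilon ::= \emptyset \mid b \mid \varepsilon \oplus \varepsilon \mid \mu x.\varepsilon \qquad (17)$$
Note that the goals $\varepsilon = \varepsilon'$ in $G$ are proven
1. either according to [Simplify], applied in the context of the equational interpolant (16). If this is the case, then $\varepsilon = \varepsilon'$ holds by reflexivity, therefore
$$\mathcal{B}_{\mathcal{G}} \vdash_{NDF} \boxed{\delta_{\mathsf{B} \lhd \mathsf{B}}(\varepsilon) = \delta_{\mathsf{B} \lhd \mathsf{B}}(\varepsilon')} \qquad (18)$$
also holds;
2. or after the application of [Derive], in which case $\mathcal{B}_{\mathcal{G}} \cup \boxed{F} \vdash_{NDF} \boxed{\delta_{\mathsf{B} \lhd \mathsf{B}}(\varepsilon) = \delta_{\mathsf{B} \lhd \mathsf{B}}(\varepsilon')}$ holds. Moreover, note that $\delta_{\mathsf{B} \lhd \mathsf{B}}(\varepsilon)$ and $\delta_{\mathsf{B} \lhd \mathsf{B}}(\varepsilon')$ are reduced to $b$, respectively $b' \in \mathsf{B}$, according to (17) and the definition of $\delta_{\mathsf{B} \lhd \mathsf{B}}$. Consequently, the non-derivable (due to the subsort relation B < Slt) goal $b = b'$ holds by reflexivity, so the following is a sound statement:
$$\mathcal{B}_{\mathcal{G}} \vdash_{NDF} \boxed{\delta_{\mathsf{B} \lhd \mathsf{B}}(\varepsilon) = \delta_{\mathsf{B} \lhd \mathsf{B}}(\varepsilon')}. \qquad (19)$$
Based on (18), (19) and Corollary 1.b), we conclude that $\widetilde{F} = cl(G_{id})$ is a bisimulation, hence $G \subseteq cl(G_{id}) \subseteq\ \sim_{\mathcal{G}}$.
b) $\mathcal{G} \neq \mathsf{B}$. Based on the reduction rules implemented in CIRC, it is quite easy to see that the initial set of goals $G$ is a $\vdash_{NDF}$-consequence of $\mathcal{B}_{\mathcal{G}} \cup \boxed{F}$. In other words, $G \subseteq cl(F_{id})$. So, if we anticipate a bit, we should show that $\widetilde{F} = cl(F_{id})$ is a bisimulation, i.e., according to Corollary 1, $\mathcal{B}_{\mathcal{G}} \cup \boxed{F} \vdash_{NDF} \boxed{\delta_{\mathcal{G} \lhd \mathcal{G}}(F)}$. This is achieved by proving that $\mathcal{B}_{\mathcal{G}} \cup \boxed{F} \vdash_{NDF} G_i$ ($i \in \overline{0,n}$) (note that $\boxed{\delta_{\mathcal{G} \lhd \mathcal{G}}(F)} \subseteq \bigcup_{i \in \overline{0,n}} G_i$, according to [Derive]). The proof is by induction on $j$, where $n - j$ is the current proof step, and by case analysis on the CIRC reduction rules applied at each step.

We further provide a sketch of the proof.

The base case $j = n$ follows immediately, as $\mathcal{B}_{\mathcal{G}} \cup \boxed{F} \vdash_{NDF} G_n = \emptyset$.

For the induction step we proceed as follows. Let $e \in G_j$. If $e \in G_{j+1}$ then $\mathcal{B}_{\mathcal{G}} \cup \boxed{F} \vdash_{NDF} e$ by the induction hypothesis. If $e \notin G_{j+1}$ then, for example, if [Reduce] was applied then it holds that $\mathcal{B}_{\mathcal{G}} \cup \boxed{F_j} \vdash_{NDF} e$. Recall that $F_j \subseteq F$, so $\mathcal{B}_{\mathcal{G}} \cup \boxed{F} \vdash_{NDF} e$ also holds. The result follows in a similar fashion for the application of [Derive] or [Simplify]. □

Remark 9.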
The soundness of the proof system we describe in this paper does not follow directly from Theorem 3 in [17]. This is due to the fact that we do not have an experiment-based definition of bisimilarity. So, even though the mechanism we use for proving $\mathcal{B}_{\mathcal{G}} \cup \boxed{F} \vdash_{NDF} \boxed{\delta_{\mathcal{G} \lhd \mathcal{G}}(F)}$ (for the case $\mathcal{G} \neq \mathsf{B}$) is similar to the one described in [17], the current soundness proof is conceived in terms of bisimulations (and not experiments).

Remark 10. The entailment relation $\vdash_{NDF}$ that CIRC uses for checking the equivalence of generalized regular expressions is an instantiation of the parametric entailment relation $\vdash$ from the proof system in [17]. This approach allows CIRC to reason automatically on a large class of systems which can be modeled as non-deterministic coalgebras.
As already stated, our final goal is to use CIRC as a decision procedure for the bisimilarity of generalized regular expressions. That is, whenever provided with a set of expressions, the prover stops with a yes/no answer w.r.t. their equivalence. In this context, an important aspect is that the sub-coalgebra generated by an expression $\varepsilon \in Exp_{\mathcal{G}}$ by repeatedly applying $\delta_{\mathcal{G}}$ is, in general, infinite. Take for example the non-deterministic functor $\mathcal{S} = \mathsf{B} \times Id$ associated to infinite streams, and consider the property $\mu x.\emptyset \oplus r\langle x \rangle = \mu x.r\langle x \rangle$. In order to prove this, CIRC builds an infinite proof sequence by repeatedly applying $\delta_{\mathcal{S}}$ as follows:
$$\begin{array}{c} \delta_{\mathcal{S}}(\mu x.\emptyset \oplus r\langle x \rangle) = \delta_{\mathcal{S}}(\mu x.r\langle x \rangle) \\ \downarrow \\ \langle 0, \emptyset \oplus (\mu x.\emptyset \oplus r\langle x \rangle) \rangle = \langle 0, \mu x.r\langle x \rangle \rangle \\[1ex] \delta_{\mathcal{S}}(\emptyset \oplus (\mu x.\emptyset \oplus r\langle x \rangle)) = \delta_{\mathcal{S}}(\mu x.r\langle x \rangle) \\ \downarrow \\ \langle 0, \emptyset \oplus \emptyset \oplus (\mu x.\emptyset \oplus r\langle x \rangle) \rangle = \langle 0, \mu x.r\langle x \rangle \rangle \\ {[\ldots]} \end{array}$$
In this case, the prover would never stop. We observed in Section 3 that Theorem 1 guarantees we can associate a finite coalgebra to a certain expression. In the proof of the aforementioned theorem, which is presented in [23], it is shown that the axioms for associativity, commutativity and idempotence (ACI) of $\oplus$ guarantee finiteness of the generated sub-coalgebra (note that these axioms have also been proven sound w.r.t. bisimulation). ACI properties can easily be specified in CIRC, as the prover is an extension of Maude, which has a powerful matching modulo ACUI (ACI plus unity) capability. The idempotence is given by the equation $\varepsilon \oplus \varepsilon = \varepsilon$, and the commutativity and associativity are specified as attributes of $\oplus$. It is interesting to remark that for the powerset functor termination is guaranteed without the axioms, because the coalgebra structure on the expressions for the powerset functor already includes ACI (since $\mathcal{P}_\omega(Exp)$ is itself a join-semilattice).
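The effect of the ACI axioms on the diverging sequence above can be illustrated with a tiny normalization function. This is a minimal sketch under assumptions: expressions are encoded as nested tuples with a hypothetical `"plus"` tag for $\oplus$, the string `"empty"` stands for $\emptyset$, and only top-level sums are flattened. It is not how Maude performs matching modulo ACI; it merely shows why the successive goals collapse into one equivalence class.

```python
def summands(expr):
    """Return the set of non-sum summands of expr.

    Flattening nested ("plus", e1, e2) nodes into a frozenset identifies
    expressions modulo associativity, commutativity and idempotence of plus.
    The unit law (empty plus e = e) is deliberately not applied.
    """
    if isinstance(expr, tuple) and expr[0] == "plus":
        return summands(expr[1]) | summands(expr[2])
    return frozenset([expr])


# Hypothetical encoding of mu x. (empty plus r<x>) as an opaque atom.
loop = ("mu", "x", ("plus", "empty", ("r", "x")))

step1 = ("plus", "empty", loop)                     # empty + loop
step2 = ("plus", "empty", ("plus", "empty", loop))  # empty + empty + loop
step3 = ("plus", ("plus", loop, "empty"), "empty")  # same summands, regrouped

same = summands(step1) == summands(step2) == summands(step3)
```

Since every unfolding step only adds another copy of $\emptyset$ to the sum, all the left-hand sides have the same normal form, so only finitely many distinct goals are generated.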
Theorem 4. Let $G$ be a set of proof obligations over generalized regular expressions. CIRC can be used as a decision procedure for the equivalences in $G$, that is, it can decide whether a goal $(\varepsilon_1, \varepsilon_2) \in G$ is a true or false equality.

Proof. Note that, as proven in [23], the ACI axioms for $\oplus$ guarantee that $\delta_{\mathcal{G}}$ is applied a finite number of times in the generation of the sub-coalgebra associated to a $\mathcal{G}$-expression. Therefore, it straightforwardly follows that by implementing the ACI axioms in CIRC (as attributes of $\oplus$), the set of new goals obtained by applying $\delta_{\mathcal{G}}$ is finite. In these circumstances, whenever CIRC stops according to the reduction rule [Done], the initial proof obligations are bisimilar. On the other hand, whenever it terminates with [Fail], the goals are not bisimilar. □
6. A CIRC-based Tool

We have implemented a tool that, when provided with a functor $\mathcal{G}$, automatically generates a specification for CIRC which can then be used to automatically check whether two $\mathcal{G}$-expressions are bisimilar. The tool is implemented as a metalanguage application in Maude. It can be downloaded from the address http://goriac.info/tools/functorizer/ . In order to start the tool, one needs to launch Maude along with the extension Full-Maude and load the downloaded file using the command in functorizer.maude .

The general use case consists in providing the join-semilattices, the alphabets and the expressions. After these steps, the tool automatically checks if the provided expressions are guarded, closed and correctly typed. If this check succeeds, then it outputs a specification that can be further processed by CIRC. In the end, the prover outputs either the bisimulation, if the expressions are equivalent, or a negative answer, otherwise.

We present two case studies in order to emphasize the high degree of generality for the types of systems we can handle, and show how the tool is used.

Example 3.
We consider the case of Mealy machines, which are coalgebras for the functor $(\mathsf{B} \times Id)^A$.

Formally, a Mealy machine is a pair $(S, \alpha)$ consisting of a set $S$ of states and a transition function $\alpha : S \to (\mathsf{B} \times S)^A$, which for each state $s \in S$ and input $a \in A$ associates an output value $b$ and a next state $s'$. Typically, we write $\alpha(s)(a) = (b, s') \Leftrightarrow s \xrightarrow{a|b} s'$.

In this example and in what follows we will consider for the output the two-value join-semilattice $\mathsf{B} = \{0, 1\}$ (with $\bot_{\mathsf{B}} = 0$) and for the input alphabet $A = \{a, b\}$. The expressions for Mealy machines are given by the grammar:
$$\begin{array}{lcl} E & ::= & \emptyset \mid x \mid E \oplus E \mid \mu x.E_2 \mid a(r\langle E \rangle) \mid b(r\langle E \rangle) \mid a(l\langle E_1 \rangle) \mid b(l\langle E_1 \rangle) \\ E_1 & ::= & \emptyset \mid E_1 \oplus E_1 \mid 0 \mid 1 \\ E_2 & ::= & \emptyset \mid E_2 \oplus E_2 \mid \mu x.E_2 \mid a(r\langle E \rangle) \mid b(r\langle E \rangle) \mid a(l\langle E_1 \rangle) \mid b(l\langle E_1 \rangle) \end{array}$$
Intuitively, an expression of shape $a(l\langle E_1 \rangle)$ specifies a state that for an input $a$ has an output value specified by $E_1$. For example, the expression $a(l\langle 1 \rangle)$ specifies a state that for input $a$ outputs 1, whereas in the case of $a(l\langle \emptyset \rangle)$ the output is 0. An expression of shape $a(r\langle E \rangle)$ specifies a state that for a certain input $a$ has a transition to a new state represented by $E$. For example, the expression $\mu x.a(r\langle x \rangle)$ states that for input $a$, the machine will perform an "$a$-loop" transition, whereas $a(r\langle \emptyset \rangle)$ states that for input $a$ there is a transition to the state denoted by $\emptyset$. It is interesting to note that a state will only be fully specified in what concerns transitions and output (for a given input $a$) if both $a(l\langle E_1 \rangle)$ and $a(r\langle E \rangle)$ appear in the expression (combined by $\oplus$). In the case that only the transition (resp. the output) is specified, the underspecification is solved by setting the output (resp. the target state) to $\bot_{\mathsf{B}} = 0$ (resp. $\emptyset$).

Next, to provide the reader with intuition, we will explain how one can reason on the bisimilarity of two simple expressions, by constructing bisimulation relations. Later on, we show how CIRC can be used in conjunction with our tool in order to act as a decision procedure when checking equivalence of two expressions, in a fully automated manner.

We will start with the expressions $\varepsilon_1 = \mu x.a(r\langle x \rangle)$ and $\varepsilon_2 = \emptyset$. We have to build a bisimulation relation $R$ on $\mathcal{G}$-expressions, such that $(\varepsilon_1, \varepsilon_2) \in R$. We do this in the following way: we start by taking $R = \{ (\varepsilon_1, \varepsilon_2) \}$ and we check whether this is already a bisimulation, by considering the output values and transitions and checking whether any new pairs of expressions appear in this process. If new pairs of expressions appear, we add them to $R$ and repeat the process. Intuitively, this can be represented as follows:

[Diagram: starting from $R = \{ (\varepsilon_1, \varepsilon_2) \}$, the $a$- and $b$-transitions of both expressions have output 0; the pair $(\varepsilon_2, \varepsilon_2)$ is not yet in $R$ and is added, giving $R = \{ (\varepsilon_1, \varepsilon_2), (\varepsilon_2, \varepsilon_2) \}$, which is closed under transitions ✓]

Figure 4: Bisimulation construction
In the figure above, and as before, we use the notation $\varepsilon_1\, R\, \varepsilon_2$ to denote $(\varepsilon_1, \varepsilon_2) \in R$. As illustrated in Figure 4, $R = \{ (\varepsilon_1, \varepsilon_2), (\varepsilon_2, \varepsilon_2) \}$ is closed under transitions and is therefore a bisimulation. Hence, $\varepsilon_1 \sim_{\mathcal{G}} \varepsilon_2$.

The proved equality $\emptyset = \mu x.a(r\langle x \rangle)$ might seem unexpected, if the reader is familiar with labelled transition systems. The equality is sound because these are expressions specifying behavior of a Mealy machine and, semantically, both denote the function that for every non-empty word outputs 0 (the semantics of Mealy machines is given by functions $\mathsf{B}^{A^+}$; intuitively one can think of these expressions as both denoting the empty language). This is visible if one draws the automata corresponding to both expressions (say, for simplicity, the alphabet is $A = \{a\}$):

[Diagram: the state $\emptyset$ with a self-loop labelled $a|0$, and the state $\mu x.a(r\langle x \rangle)$ with a self-loop labelled $a|0$]

Note that (i) the $\emptyset$ expression for Mealy machines is mapped with $\delta$ to a function that for input $a$ gives $\langle 0, \emptyset \rangle$, which represents a state with an $a$-loop to itself and output 0; (ii) the second expression specifies explicitly an $a$-loop to itself and it also has output 0, since no output value is explicitly defined. Now, also note that similar expressions for labelled transition systems (LTS), or coalgebras of the functor $\mathcal{P}_\omega(-)^A$, would not be bisimilar, since one would have an $a$-transition and the other one not. This is because the $\emptyset$ expression for LTS really denotes a deadlock state. In operational terms they would be converted to the systems

[Diagram: the state $\emptyset$ with no transitions, and the state $\mu x.a(x)$ with an $a$-labelled self-loop]

which now have an obvious difference in behavior.

By performing a similar reasoning as in the example above one can show that the expressions $\varepsilon_1 = \mu x.a(r\langle x \rangle) \oplus b(r\langle x \rangle)$ and $\varepsilon_2 = \mu x.a(r\langle x \rangle)$ are bisimilar, and the bisimulation relation is built as illustrated in Figure 5:

[Diagram: starting from $R = \{ (\varepsilon_1, \varepsilon_2) \}$, the $a$- and $b$-transitions of both expressions have output 0; the pair $(\varepsilon_1, \emptyset)$ is not yet in $R$ and is added, giving $R = \{ (\varepsilon_1, \varepsilon_2), (\varepsilon_1, \emptyset) \}$, which is closed under transitions ✓]

Figure 5: Bisimulation construction
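The manual constructions of Figures 4 and 5 follow the same loop, which can be sketched for finite Mealy machines. The code below is an illustration under assumptions, not the tool's implementation: the transition table `delta` is written by hand from the informal reading of the expressions above (underspecified outputs are 0, underspecified targets are $\emptyset$), the names `mealy_bisimulation`, `"e1"`, `"e2"`, `"empty"` and `"out1"` are hypothetical, and identity pairs are skipped rather than stored, in the spirit of $R_{id}$.

```python
def mealy_bisimulation(s, t, delta, alphabet):
    """Build a bisimulation containing (s, t), or return None if none exists.

    delta[state][input] = (output, next_state) describes a finite Mealy machine.
    """
    relation, todo = set(), [(s, t)]
    while todo:
        p, q = todo.pop()
        if p == q or (p, q) in relation:
            continue                        # identical states or known pair
        if any(delta[p][a][0] != delta[q][a][0] for a in alphabet):
            return None                     # outputs differ: not bisimilar
        relation.add((p, q))                # record the pair ...
        todo.extend((delta[p][a][1], delta[q][a][1]) for a in alphabet)
    return relation


alphabet = ["a", "b"]
delta = {
    # e1 = mu x. a(r<x>) + b(r<x>): loops on both inputs, output 0
    "e1": {"a": (0, "e1"), "b": (0, "e1")},
    # e2 = mu x. a(r<x>): a-loop; the b-transition is underspecified
    "e2": {"a": (0, "e2"), "b": (0, "empty")},
    # the empty expression: output 0 and a move to itself on every input
    "empty": {"a": (0, "empty"), "b": (0, "empty")},
    # a(l<1>): outputs 1 on a, everything else underspecified
    "out1": {"a": (1, "empty"), "b": (0, "empty")},
}

fig4 = mealy_bisimulation("e2", "empty", delta, alphabet)
fig5 = mealy_bisimulation("e1", "e2", delta, alphabet)
diff = mealy_bisimulation("out1", "empty", delta, alphabet)
```

Here `fig5` contains the two pairs of Figure 5, `fig4` contains only the non-identity pair of Figure 4, and `diff` is `None` because the outputs on input a disagree.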
Let us further consider the Mealy machine depicted in Figure 6, where all states are bisimilar.
[Figure 6: Mealy machine: s₁ ∼ s₂]
We show how to check the equivalence of two expressions characterizing the states s₁ and s₂, in a fully automated manner, using CIRC. These expressions are ε₁ = µx.b(l⟨ ⟩) ⊕ b(r⟨ε⟩) ⊕ a(µy.a(r⟨y⟩) ⊕ b(r⟨ε⟩) ⊕ b(l⟨ ⟩)) and ε₂ = µx.b(l⟨ ⟩) ⊕ b(r⟨x⟩) ⊕ a(r⟨x⟩), respectively.
In order to check the bisimilarity of ε₁ and ε₂ we load the tool and define the semilattice B = {0, 1} and the alphabet A = {a, b}:
    (jslt B is 0 1 bottom 0 . 0 v 0 = 0 . 0 v 1 = 1 . 1 v 1 = 1 . endjslt)
    (alph A is a b endalph)
We provide the functor G using the command (functor (B x Id) ^ A .). The command (set goal ... .) specifies the goal we want to prove:
    (set goal \mu X:FixpVar . b(l<1>) (+) a(l<0>) (+) b(r
In order to generate the CIRC specification we use the command (generate coalgebra .). Next we need to load CIRC along with the resulting specification and start the proof engine using the command (coinduction .).
As already shown, behind the scenes, CIRC builds a bisimulation relation that includes the initial goal. The proof succeeds and the output consists of (a subset of) this bisimulation:
    Proof succeeded.
      Number of derived goals: 2
      Number of proving steps performed: 50
      Maximum number of proving steps is set to: 256
    Proved properties:
    - phi (+) (\mu X . a(l<0>) (+) a(r
For ease of understanding, we printed here a readable version of the proved properties. In Section 6.1, however, we show that internally each expression is brought to a canonical form by renaming the variables. Moreover, note that in our tool ∅ is represented by the constant phi. All the examples provided in the current section make use of this convention.
As previously mentioned, CIRC is also able to detect when two expressions are not equivalent. Take, for instance, the expressions µx.a(l⟨ ⟩) ⊕ a(r⟨a(l⟨ ⟩) ⊕ a(r⟨x⟩)⟩) and a(l⟨ ⟩) ⊕ a(r⟨a(r⟨µx.a(r⟨x⟩) ⊕ a(l⟨ ⟩)⟩) ⊕ a(l⟨ ⟩)⟩), characterizing two states of the Mealy machines in Fig. 7. After following steps similar to the ones previously enumerated, the proof fails and the output message is Visible goal [...] failed during coinduction.
[Figure 7: Mealy machines with two non-bisimilar states]
Example 4.
Let us show how one may check strong bisimilarity of two nondeterministic processes of a non-trivial CCS-like language with termination, deadlock, and divergence, as studied in [1]. A process is a guarded, closed term defined by the following grammar:
    P ::= ✓ | δ | Ω | a.P | P₁ + P₂ | x | µx.P    (20)
where:
• ✓ is the constant for successful termination,
• δ denotes deadlock,
• Ω is the divergent computation (i.e., the undefined process),
• a.P is the process executing the action a and then continuing as the process P, for any action a from a given set A,
• P₁ + P₂ is the non-deterministic process behaving as either P₁ or P₂, and
• µx.P is the recursive process P[µx.P/x].
In [23] it is shown that, up to strong bisimilarity, the above syntax of processes is equivalent to the canonical set of (guarded, closed) regular expressions derived for the functor 1 + P_ω(Id)^A:
    E  ::= ∅ | E ⊕ E | x | µx.E | l[E₁] | r[E₂]
    E₁ ::= ∅ | E₁ ⊕ E₁ | 1
    E₂ ::= ∅ | E₂ ⊕ E₂ | a(E₃)
    E₃ ::= ∅ | E₃ ⊕ E₃ | {E}
The translation map (−)† from processes to expressions is defined by induction on the structure of the process:
    (✓)† = l[1]        (a.P)† = r[a({P†})]
    (δ)† = r[∅]        (P₁ + P₂)† = (P₁)† ⊕ (P₂)†
    (Ω)† = ∅           (µx.P)† = µx.P†
    x† = x
Consider now two processes P and Q over the alphabet A = {a, b}:
    P = µx.(a.x + a.P₁ + b.b.✓ + b.(δ + Ω))
    Q = µz.(a.z + b.(δ + b.✓) + b.δ)
where P₁ = µy.(a.(y + δ) + b.δ + b.(δ + b.✓) + δ). Graphically, the two processes can be represented by the following labelled transition systems (for simplicity we omit annotating states with information regarding the satisfiability of successful termination, divergence, and deadlock):
[Figure 8: Nondeterministic processes: Q ∼ P]
We want to check whether the process P is strongly bisimilar to the process Q. By using the above translation, process P is represented by the expression
    µx.(r[a({µy.(r[a({y ⊕ r[∅]})] ⊕ r[b({r[∅]})] ⊕ r[b({r[∅] ⊕ r[b({l[1]})]})] ⊕ r[∅])})] ⊕ r[a({x})] ⊕ r[b({r[b({l[1]})]})] ⊕ r[b({r[∅] ⊕ ∅})])
whereas process Q is represented by the expression
    µz.(r[a({z})] ⊕ r[b({r[∅] ⊕ r[b({l[1]})]})] ⊕ r[b({r[∅]})]).
In order to use the tool, one needs to specify the semilattice, the alphabet, the functor, and the goal in a manner similar to the one previously presented:
    (jslt B is 1 bottom 1 . 1 v 1 = 1 . endjslt)
    (alph A is a b endalph)
    (functor B + (P Id) ^ A .)
    (set goal
      \mu X:FixpVar .
        r[ a( { X:FixpVar } ) ] (+)
        r[ a( { \mu Y:FixpVar .
          r[ a( { Y:FixpVar (+) r[ phi ] } ) ] (+)
          r[ b( { r[ phi ] } ) ] (+)
          r[ b( { r[ phi ] (+) r[ b( { l[ 1 ] } ) ] } ) ] (+)
          r[ phi ] } )] (+)
        r[ b( { r[ b( { l[ 1 ] } ) ] } ) ] (+)
        r[ b( { r[ phi ] (+) phi } ) ]
      =
      \mu Z:FixpVar .
        r[ a( { Z:FixpVar } ) ] (+)
        r[ b( { r[ phi ] (+) r[ b( { l[ 1 ] } ) ] } ) ] (+)
        r[ b( { r[ phi ] } ) ] .)
For the generated specification
CIRC terminates and outputs a positive result:
    Proof succeeded.
      Number of derived goals: 15
      Number of proving steps performed: 58
      Maximum number of proving steps is set to: 256
    Proved properties:
    - r[phi] (+) (\mu Y. r[phi] (+) r[a( { r[phi] (+) Y } )] (+) r[b( { r[phi] } )] (+) r[b( { r[phi] (+) r[b( { l[1] } )] } )])
      =
      \mu Z. r[a( { Z } )] (+) r[b( { r[phi] } )] (+) r[b( { r[phi] (+) r[b( { l[1] } )] } )]
    - r[b( { l[1] } )] = r[phi] (+) r[b( { l[1] } )]
    - \mu Y. r[phi] (+) r[a( { r[phi] (+) Y } )] (+) r[b( { r[phi] } )] (+) r[b( { r[phi] (+) r[b( { l[1] } )] } )]
      =
      \mu Z. r[a( { Z } )] (+) r[b( { r[phi] } )] (+) r[b( { r[phi] (+) r[b( { l[1] } )] } )]
    - \mu X. r[a( { X } )] (+) r[a( { \mu Y. r[phi] (+) r[a( { r[phi] (+) Y } )] (+) r[b( { r[phi] } )] (+) r[b( { r[phi] (+) r[b( { l[1] } )] } )] } )] (+) r[b( { r[phi] (+) phi } )] (+) r[b( { r[b( { l[1] } )] } )]
      =
      \mu Z. r[a( { Z } )] (+) r[b( { r[phi] } )] (+) r[b( { r[phi] (+) r[b( { l[1] } )] } )]
In this section we present details on the implementation of the algebraic specification given in Section 4, based on the examples from Section 6.
In order to generate the algebraic specifications for
CIRC, given a functor and two expressions, we used the Maude system [4]. We chose it for its suitability for performing equational and rewriting logic based computations, and because of its reflective properties, which allow for the development of advanced metalanguage applications. As the technical aspects of working at the meta-level are beyond the scope of this paper, we refrain from presenting them and show, instead, what the generated specifications consist of.
Most of the algebraic specifications from Section 4 have a straightforward implementation in Maude. Consider, for instance, the case of Mealy machines presented in Example 3. The generated grammars for functors (1) and expressions (Definition 3) are coded as:
    sort Functor .
    sorts AlphName SltName .
    subsort SltName < Functor .
    op A : -> AlphName .
    op B : -> SltName .
    op G : -> Functor .
    op Id : -> Functor .
    op _+_ : Functor Functor -> Functor .
    op _^_ : Functor AlphName -> Functor .
    op _x_ : Functor Functor -> Functor .
    eq G = (B x Id) ^ A .

    sorts Exp ExpStruct Alph Slt .
    subsort Exp < ExpStruct .
    enum A is a b .
    enum B is 0 1 .
    subsort A < Alph .
    subsort B < Slt .
    op _`(+`)_ : Exp Exp -> Exp .
    op _`(_`) : Alph Exp -> Exp .
    op \mu_._ : FixpVar Exp -> Exp .
    ops l<_> r<_> : Exp -> Exp .
    op phi : -> Exp .
Most of the syntactical constructs are Maude-specific: sorts and subsort declare the sorts we work with and the relations between them, respectively; op declares operators; eq declares equations (the equation in our case defines the shape of the functor G). The only CIRC-specific construct, enum, is syntactic sugar for declaring enumerable sorts, i.e., sorts that consist only of the specified constants. As a side note, if brackets ( ( , [ , { ) are used in the declaration of an operation, then they must be preceded by a backquote ( ` ).
As mentioned in Section 2, in order to guarantee the finiteness of our procedure, one needs to include the ACI axioms for (+). Moreover, we have observed that the unity axiom for (+) plays an important role in decreasing the number of states generated by the repeated application of δ_G, therefore improving the overall time performance of the tool. For example, the number of rewritings CIRC performed in order to prove the bisimilarity of ε₁ and ε₂ in Figure 5 was halved when the unity axiom was used.
By turning on the axiomatization flag using the command (axioms on .), the following code is generated:
    op _`(+`)_ : Exp Exp -> Exp [assoc comm] .
    eq E:Exp (+) E:Exp = E:Exp .
    eq E:Exp (+) phi = E:Exp .
An obvious question is why we do not add other axioms to the tool, given that the unity axiom has improved performance. At this stage we have not studied in detail how much adding other axioms would help. In any case, there is a trade-off between the number of extra axioms included, which bring the automaton produced from an expression closer to the minimal automaton, and the time the tool takes to reduce the expressions in each step modulo the axioms. For classical regular expressions, there is an interesting empirical study on this [16]. We leave it as future work to carry out a similar study for our expressions and axioms.
The process of substituting fixed-point variables has a natural implementation.
We present the equations handling the basic expressions ∅ and x, and the operation (+):
    op _`[_/_`] : Exp Exp FixpVar -> Exp .
    eq phi [ E:Exp / X:FixpVar ] = phi .
    ceq Y:FixpVar [ E:Exp / X:FixpVar ] = E:Exp if (X:FixpVar == Y:FixpVar) .
    eq Y:FixpVar [ E:Exp / X:FixpVar ] = Y:FixpVar [owise] .
    eq (E1:Exp (+) E2:Exp) [ E:Exp / X:FixpVar ] =
       (E1:Exp [E:Exp / X:FixpVar]) (+) (E2:Exp [E:Exp / X:FixpVar]) .
In order to avoid matching problems and to overcome the fact that Maude cannot handle an equation that has fresh variables in its right-hand side (i.e., variables that do not appear in the left-hand side), we replace expression variables with parameterized constants:
    op var : Nat -> FixpVar .
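As a rough illustration of the substitution equations above, the following Python sketch mirrors the three cases shown (∅, variables, and (+)). The classes Phi, Var and Plus and the function substitute are hypothetical names introduced only for this sketch; they are not part of the generated Maude specification, and the remaining constructors of the expression language are deliberately omitted.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Phi:
    """The empty expression, written phi in the tool."""

@dataclass(frozen=True)
class Var:
    """A fixed-point variable."""
    name: str

@dataclass(frozen=True)
class Plus:
    """The sum of two expressions, written (+) in the tool."""
    left: object
    right: object

def substitute(expr, replacement, var):
    """Return expr with every occurrence of var replaced by replacement."""
    if isinstance(expr, Phi):
        return expr                      # phi [E / X] = phi
    if isinstance(expr, Var):
        # Y [E / X] = E if X == Y, and Y otherwise
        return replacement if expr == var else expr
    if isinstance(expr, Plus):
        # substitution distributes over (+)
        return Plus(substitute(expr.left, replacement, var),
                    substitute(expr.right, replacement, var))
    raise TypeError(f"constructor not covered by this sketch: {expr!r}")

expr = Plus(Var("x"), Plus(Var("y"), Phi()))
result = substitute(expr, Phi(), Var("x"))
```

In this example only the occurrence of x is replaced, so `result` equals `Plus(Phi(), Plus(Var("y"), Phi()))`.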
The operation that obtains this canonical form has an inductive definition on the structure of the given expression and makes use of the substitution operation presented above. For this reason, the bisimulation CIRC builds contains parameterized constants instead of the user-declared variables. The property proved in Example 4 is, therefore, written as:
    \mu var(2) . r[a( { var(2) } )] (+) r[a( { \mu var(1) . r[phi] (+) r[a( { r[phi] (+) var(1) } )] (+) r[b( { r[phi] } )] (+) r[b( { r[phi] (+) r[b( { l[1] } )] } )] } )] (+) r[b( { r[phi] (+) phi } )] (+) r[b( { r[b( { l[1] } )] } )]
    =
    \mu var(1) . r[a( { var(1) } )] (+) r[b( { r[phi] } )] (+) r[b( { r[phi] (+) r[b( { l[1] } )] } )]
The most important part of the algebraic specification consists of the equations defining the operations δ( ), Plus( , ), and Empty. Most of these equations are implemented as presented in [23]. The only difficulties we encountered were for the exponentiation case, as Maude does not handle higher-order functions. Without entering into details, as a workaround, we introduced a new sort Function < ExpStruct and an operation \ . : ExpoCase Alph Functor ExpStruct -> Function in order to emulate function-passing. The first argument is used to memorize the origin where the exponentiation ingredient is encountered: δ, Plus, or Empty. Its purpose is purely technical: we use it in order to avoid some internal matching problems. The other three parameters are those of the structured expression λ.(a, F ⊳ G, σ) presented in Section 4: a letter in the alphabet, an ingredient, and some other structured expression.
Another thing worth describing is the way we enable CIRC to prove equivalences when the powerset functor occurs. Namely, we present how interpolant (15) is implemented. Recall that we want to show that two sets of expressions are equivalent, which means that for each expression in the first set there must be an equivalent one in the second set and vice versa.
In order to handle sets of structured expressions we introduce a new sort,
ExpStructSet as a supersort for ExpStruct. We also consider the set separator , : ExpStructSet ExpStructSet -> ExpStructSet [assoc,comm], the empty set emptyS : -> ExpStructSet, and the set wrapping operation { } : ExpStructSet -> ExpStruct. In order to mimic universal quantification over a set, we use a special constant referred to as the token “[/]”. In what follows, we consider two variables of sort ExpStruct: ES and ES', and two variables of sort ExpStructSet: ESS and ESS'. We now describe the process of finding the equivalence between two sets:
• whenever encountering two wrapped expression sets, we add the universal quantification token to each of them in two distinct goals:
    srl { ESS } = { ESS' } => { [/] ESS } = { ESS' } /\ { ESS } = { [/] ESS' } .
• iterate through the expressions on the left-hand side (similarly for the other direction):
    srl { [/] (ES , ESS) } = { ESS' } => { [/] ES } = { ESS' } /\ { [/] ESS } = { ESS' } .
    srl { ESS } = { [/] (ES' , ESS') } => { ESS } = { [/] ES' } /\ { ESS } = { [/] ESS' } .
• when left with one expression on the left-hand side, start iterating through the expressions on the right-hand side until finding an equivalence (similarly for the other direction):
    srl { [/] ES } = { ES' , ESS' } => ES = ES' \/ { [/] ES } = { ESS' } .
    srl { ES , ESS } = { [/] ES' } => ES = ES' \/ { ESS } = { [/] ES' } .
• if no equivalence has been found, transform the current goal into a visible failure:
    srl { ESS } = emptyS => true = false .
    srl emptyS = { ESS } => true = false .
Finally, the type checker for structured expressions has a straightforward implementation. Its code does not appear in the generated specification, as it is only used when the tool receives the expressions as input. This prevents the specification from being generated and the prover from being started when invalid expressions are provided.
7. Discussion
One of the major contributions of this paper is a decision procedure for the bisimilarity of generalized regular expressions. In order to enable the implementation of the decision procedure, we have exploited an encoding of coalgebra into algebra, and we formalized the equivalence between the coalgebraic concepts associated with non-deterministic coalgebras [23] and their algebraic counterparts. This led to the definition of algebraic specifications (E_G) that model both the language and the coalgebraic structure of expressions. Moreover, we defined an equational deduction relation (⊢_NDF), used on the algebraic side for reasoning on the bisimilarity of expressions.
The most important result of the parallel between the coalgebraic and algebraic approaches is given in Corollary 1, which formalizes the definition of the bisimulation relations in algebraic terms. This result is the key to proving the soundness of the decision procedure implemented in the automated prover CIRC [14]. As a coinductive prover, CIRC builds a relation F closed under the application of δ_G with respect to ⊢_NDF (E_G ∪ F ⊢_NDF δ_G(F)), hence automatically computing a bisimulation to which the initial proof obligations belong.
The approach we present in this paper enables CIRC to perform reasoning based on bisimulations (instead of experiments [17]). This way, the prover is extended to checking bisimilarity in a large class of systems that can be modeled as non-deterministic coalgebras. Note that the constructions above are all automated: the (non-trivial) CIRC algebraic specification describing E_G, together with the interpolants implementing ⊢_NDF, are generated with the Maude tool presented in Section 6.
We now mention some of the existing coalgebra-based tools for proving bisimilarity and the main differences with the tool presented in this paper. CoCasl [8] and CCSL [19] are tools that can generate proof obligations for theorem provers from coalgebraic specifications. In [8] several tactics for interactive and automatic bisimulation building are implemented in Isabelle/HOL and are used to derive bisimilarities for translated specifications from CoCasl. The main difference between our tool and CoCasl or CCSL is that, given a functor, our tool derives a specification language for which equivalence is decidable (that is, it is automatic and not interactive). CIRC [5, 17], on top of which the current tool is built, is based on hidden logic [18] and uses a partial decision procedure for proving bisimilarities via implicit construction of bisimulations. Our tool can be seen as an extension of CIRC to a fully automatic theorem prover for the class of non-deterministic coalgebras. We stress the fact that the focus of this paper was on a language for which equivalence is decidable. Tools such as CoCasl, CCSL or CIRC have a more expressive language, where one can, for instance, specify streams that could not be specified in our language (intuitively, the streams we can specify in our language are eventually periodic). In those tools, however, decidability of equivalence cannot be guaranteed.
There are several directions for future work.
Extending the class of systems to include quantitative coalgebras (such as weighted automata and Markov chains) will enlarge the scope of applicability of the tool. The challenge in this extension arises from the fact that the definition of expressions for quantitative coalgebras involving the distribution monad is not as modular as for the other functors (for details see [22]). This is a consequence of the fact that the sum of two valid expressions might not be a valid expression anymore (since in distributions we require that the probabilities add up to 1). Moreover, calculating bisimulation relations in the quantitative setting will involve metric manipulation, which is currently not implemented in CIRC.
To improve usability, building a graphical interface for the tool is an obvious next step. The graphical interface should ideally allow the specification of expressions by means of systems of equations (which are then solved internally) or even by means of an automaton, which would then be translated to an expression using Kleene's theorem. We would also like to explore how adding more axioms than ACI to the prover (that is, performing each step of the bisimulation checking modulo more equations) improves the performance. Our experience so far shows that by adding the axiom for the distribution of the ∅ expression through the constructors, i.e. ∅ ⊕ ε = ε, the prover works significantly faster.
We have not yet studied complexity bounds for the algorithms presented in this paper. We conjecture, however, that the bounds will be very similar to those already known for classical regular expressions [13, 25]. Further explorations in this direction are left as future work.
Acknowledgments.
We would like to thank the referees for the many constructive comments, which greatly helped us to improve the paper. The authors are also grateful for useful comments from Luca Aceto, Filippo Bonchi, and Miguel Palomino Tarjuelo. The work of Georgiana Caltais and Eugen-Ioan Goriac has been partially supported by the project ‘Meta-theory of Algebraic Process Theories’ (nr. 100014021) of the Icelandic Research Fund. The work of Eugen-Ioan Goriac has also been partially supported by the project ‘Extending and Axiomatizing Structural Operational Semantics: Theory and Tools’ (nr. 110294-0061) of the Icelandic Research Fund. The work of Dorel Lucanu has been partially supported by the PNII grant DAK project Contract 161/15.06.2010, SMIS-CSNR 602-12516. The work of Alexandra Silva was partially funded by ERDF - European Regional Development Fund through the COMPETE Programme and by Fundação para a Ciência e a Tecnologia, Portugal within projects
FCOMP-01-0124-FEDER-020537 and
SFRH/BPD/71956/2010.
References
[1] L. Aceto and M. Hennessy. Termination, deadlock, and divergence. J. ACM, 39:147–187, January 1992.
[2] M. Bonsangue, G. Caltais, E.-I. Goriac, D. Lucanu, J. Rutten, and A. Silva. A decision procedure for bisimilarity of generalized regular expressions. In Proceedings of the 13th Brazilian conference on Formal methods: foundations and applications, SBMF'10, pages 226–241, Berlin, Heidelberg, 2011. Springer-Verlag.
[3] A. Bouhoula, J.-P. Jouannaud, and J. Meseguer. Specification and proof in membership equational logic. Theor. Comput. Sci., 236(1-2):35–132, 2000.
[4] M. Clavel, F. Durán, S. Eker, P. Lincoln, N. Martí-Oliet, J. Meseguer, and C. Talcott. All about Maude - a high-performance logical framework: how to specify, program and verify systems in rewriting logic. Springer-Verlag, Berlin, Heidelberg, 2007.
[5] J. Goguen, K. Lin, and G. Roşu. Circular coinductive rewriting. In ASE'00: Proceedings of the 15th IEEE international conference on Automated software engineering, pages 123–132, Washington, DC, USA, 2000. IEEE Computer Society.
[6] J. A. Goguen. Order-sorted algebra I: Equational deduction for multiple inheritance, overloading, exceptions and partial operations. Theoretical Computer Science, 105:217–273, 1992.
[7] E.-I. Goriac, D. Lucanu, and G. Roşu. Automating coinduction with case analysis. In Proceedings of the 12th international conference on Formal engineering methods and software engineering, ICFEM'10, pages 220–236, Berlin, Heidelberg, 2010. Springer-Verlag.
[8] D. Hausmann, T. Mossakowski, and L. Schröder. Iterative Circular Coinduction for CoCasl in Isabelle/HOL. In M. Cerioli, editor, FASE, volume 3442 of Lecture Notes in Computer Science, pages 341–356. Springer, 2005.
[9] C. Hermida and B. Jacobs. Structural induction and coinduction in a fibrational setting. Inf. Comput., 145(2):107–152, 1998.
[10] S. Kleene. Representation of events in nerve nets and finite automata. Automata Studies, pages 3–42, 1956.
[11] D. Kozen. A completeness theorem for Kleene algebras and the algebra of regular events. In LICS, pages 214–225. IEEE Computer Society, 1991.
[12] D. Kozen. Myhill-Nerode relations on automatic systems and the completeness of Kleene algebra. In A. Ferreira and H. Reichel, editors, STACS, volume 2010 of Lecture Notes in Computer Science, pages 27–38. Springer, 2001.
[13] D. Kozen. On the coalgebraic theory of Kleene algebra with tests. Technical Report http://hdl.handle.net/1813/10173, Computing and Information Science, Cornell University, March 2008.
[14] D. Lucanu, E.-I. Goriac, G. Caltais, and G. Roşu. CIRC: a behavioral verification tool based on circular coinduction. In Proceedings of the 3rd international conference on Algebra and coalgebra in computer science, CALCO'09, pages 433–442, Berlin, Heidelberg, 2009. Springer-Verlag.
[15] R. Milner. A complete inference system for a class of regular behaviours. J. Comput. System Sci., 28(3):439–466, 1984.
[16] S. Owens, J. H. Reppy, and A. Turon. Regular-expression derivatives re-examined. J. Funct. Program., 19(2):173–190, 2009.
[17] G. Roşu and D. Lucanu. Circular coinduction: a proof theoretical foundation. In Proceedings of the 3rd international conference on Algebra and coalgebra in computer science, CALCO'09, pages 127–144, Berlin, Heidelberg, 2009. Springer-Verlag.
[18] G. Roşu. Hidden Logic. PhD thesis, University of California at San Diego, 2000.
[19] J. Rothe, H. Tews, and B. Jacobs. The coalgebraic class specification language CCSL. J. UCS, 7(2):175–193, 2001.
[20] J. J. M. M. Rutten. Universal coalgebra: a theory of systems. Theor. Comput. Sci., 249(1):3–80, 2000.
[21] A. Salomaa. Two complete axiom systems for the algebra of regular events. J. ACM, 13(1):158–169, 1966.
[22] A. Silva, F. Bonchi, M. Bonsangue, and J. Rutten. Quantitative Kleene coalgebras. Information and Computation, 209(5):822–849, 2011.
[23] A. Silva, M. M. Bonsangue, and J. J. M. M. Rutten. Non-deterministic Kleene coalgebras. Logical Methods in Computer Science, 6(3), 2010.
[24] S. Staton. Relating coalgebraic notions of bisimulation: with applications to name-passing process calculi. In Proceedings of the 3rd international conference on Algebra and coalgebra in computer science, CALCO'09, pages 191–205, Berlin, Heidelberg, 2009. Springer-Verlag.
[25] J. Worthington. Automatic proof generation in Kleene algebra. In R. Berghammer, B. Möller, and G. Struth, editors, RelMiCS, volume 4988 of