Automatic Equivalence Proofs for Non-deterministic Coalgebras

Abstract

A notion of generalized regular expressions for a large class of systems modeled as coalgebras, and an analogue of Kleene's theorem and Kleene algebra, were recently proposed by a subset of the authors of this paper. Examples of the systems covered include infinite streams, deterministic automata, Mealy machines and labelled transition systems. In this paper, we present a novel algorithm to decide whether two expressions are bisimilar or not. The procedure is implemented in the automatic theorem prover CIRC, by reducing coinduction to an entailment relation between an algebraic specification and an appropriate set of equations. We illustrate the generality of the tool with three examples: infinite streams of real numbers, Mealy machines and labelled transition systems.


Marcello Bonsangue (a,d), Georgiana Caltais (b), Eugen-Ioan Goriac (b), Dorel Lucanu (c), Jan Rutten (d,e), Alexandra Silva (e,d,f)

(a) LIACS - Leiden University, The Netherlands
(b) School of Computer Science - Reykjavik University, Iceland
(c) Faculty of Computer Science - Alexandru Ioan Cuza University, Romania
(d) Centrum Wiskunde & Informatica, The Netherlands
(e) Radboud University Nijmegen, The Netherlands
(f) HASLab / INESC TEC, Universidade do Minho, Braga, Portugal


1. Introduction

(Preprint submitted to Science of Computer Programming, June 18, 2018.)

Regular expressions and finite deterministic automata (DFA's) constitute two of the most basic structures in computer science. Kleene's theorem [10] gives a fundamental correspondence between these two structures: each regular expression denotes a language that can be recognized by a DFA and, conversely, the language accepted by a DFA can be specified by a regular expression. Languages denoted by regular expressions are called regular. Two regular expressions are (language) equivalent if they denote the same regular language. Salomaa [21] presented a sound and complete axiomatization (later refined by Kozen in [11, 12]) for proving the equivalence of regular expressions.

The above programme was applied by Milner in [15] to process behaviours and labelled transition systems (LTS's). Milner introduced a set of expressions for finite LTS's and proved an analogue of Kleene's Theorem: each expression denotes the behaviour of a finite LTS and, conversely, the behaviour of a finite LTS can be specified by an expression (modulo bisimilarity). Milner also provided an axiomatization for his expressions, with the property that two expressions are provably equivalent if and only if they are bisimilar.

Coalgebras arose in the last decade as a suitable mathematical framework to study state-based systems, such as DFA's and LTS's. For a functor $G : \mathbf{Set} \to \mathbf{Set}$, a $G$-coalgebra or $G$-system is a pair $(S, g)$, consisting of a set $S$ of states and a function $g : S \to G(S)$ defining the "transitions" of the states. We call the functor $G$ the type of the system. For instance, DFA's can be readily seen to correspond to coalgebras of the functor $G(S) = 2 \times S^A$ and image-finite LTS's are obtained by $G(S) = \mathcal{P}_\omega(S)^A$, where $\mathcal{P}_\omega$ is finite powerset.

For coalgebras of a large class of functors, a language of regular expressions, a corresponding generalization of Kleene's theorem, and a sound and complete axiomatization for the associated notion of behavioral equivalence were introduced in [23]. Both the language of expressions and their axiomatization were derived, in a modular fashion, from the functor defining the type of the system.

Algebra and related tools can be successfully used for reasoning on properties of systems. In this paper, we present a novel method for checking for bisimilarity of generalized regular expressions using the coinductive theorem prover CIRC [5, 17]. The main novelty of the method lies in the generality of the systems it can handle.

CIRC is a metalanguage application implemented in Maude [4], and its target is to prove properties over infinite data structures. It has been successfully used for checking the equivalence of programs, and trace equivalence and strong bisimilarity of processes. The tool may be tested online and downloaded from https://fmse.info.uaic.ro/tools/Circ/.

Determining whether two expressions are equivalent is important in order to be able to compare behavioral specifications. In the presence of a sound and complete axiomatization one can determine equivalence using algebraic reasoning. A coalgebraic perspective on regular expressions has however provided a more operational/algorithmic way of checking equivalence: one constructs a bisimulation relation containing both expressions. The advantage of the bisimulation approach is that it enables automation, since the steps of the construction are fairly mechanical and require almost no ingenuity.

We remark that in theory it has been shown that both problems are in PSPACE [13, 25], but in practice bisimulation checking tends to be easier. We illustrate this with an example, to give the reader the feeling of the more algorithmic nature of bisimulation. We want to stress however that we are not underestimating the value of an algebraic treatment of regular expressions: on the contrary, as we will show later, the axiomatization plays an important role in guaranteeing termination of the bisimulation construction and is therefore crucial for the main result of this article.

We show below a proof of the sliding rule: $a(ba)^* \equiv (ab)^*a$. The algebraic proof, using the rules and equations of Kleene algebra, needs to show the two containments $a(ba)^* \le (ab)^*a$ and $(ab)^*a \le a(ba)^*$, and it requires some ingenuity in the choice of the equation applied in each step. We show the proof for the first inequality; the other follows a similar proof pattern.

$$
\begin{array}{lll}
 & a(ba)^* \le (ab)^*a & \\
\Leftarrow & a + (ab)^*a(ba) \le (ab)^*a & \text{right-star rule} \\
\iff & (1 + (ab)^*ab)\,a \le (ab)^*a & \text{associativity and distributivity} \\
\iff & (ab)^*a \le (ab)^*a & \text{right expansion rule: } 1 + r^*r = r^*
\end{array}
$$

For the coalgebraic proof, we build incrementally, and rather mechanically, a bisimulation relation containing the pair $(a(ba)^*, (ab)^*a)$. We start with the pair we want to prove equivalent and then we close the relation with respect to syntactic language derivatives, also known as Brzozowski derivatives. In the current example, the bisimulation relation would contain three pairs:

$$R = \{\, (a(ba)^*, (ab)^*a),\ ((ba)^*, b(ab)^*a + 1),\ (0, 0) \,\}$$

where 1 and 0 are, respectively, the regular expressions denoting the empty word and the empty language. In constructing this relation, no decisions were made, and hence the suitability of bisimulation construction as an automatic technique to prove equivalence of regular expressions.

The main contributions of this paper can be summarized as follows. We present a decision procedure to determine equivalence of generalized regular expressions, which specify behaviours of many types of transition systems, including Mealy machines, labelled transition systems and infinite streams. The valid expressions for each system are type-checked automatically in the tool. We illustrate the decision procedure we devised by applying it to several examples. As a vehicle of implementation, we choose CIRC, a coinductive theorem prover which has already been explored for the construction of bisimulations. To ease the implementation in CIRC, we present the algebraic specifications' counterpart of the coalgebraic framework of the generalized regular expressions mentioned above. This enables us to automatically derive algebraic specifications that model the language of expressions, and to define an appropriate equational entailment relation which mimics our decision procedure for checking behavioural equivalence of expressions. The implementation of both the algebraic specification and the entailment relation in CIRC allows for automatic reasoning on the equivalence of expressions.

The present paper is an extended version of the conference paper [2]. In comparison with the aforementioned paper we have extended the tool to deal with non-deterministic systems. More precisely, we have included the powerset functor in the class of functors considered. Moreover, we have included all the proofs, more examples and additional explanations on the theory behind and implementation of the tool.

Organization of the paper. Section 2 recalls the basic definitions of the language associated to a non-deterministic functor. Section 3 describes the decision procedure to check equivalence of regular expressions. Section 4 formulates the aforementioned language as an algebraic specification, which paves the way to implement in CIRC the procedure to decide equivalence of expressions. The implementation of the decision procedure and its soundness are described in Section 5. In Section 6 we show, by means of several examples, how one can check bisimilarity using CIRC. Section 7 contains concluding remarks and pointers for future work.
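The coalgebraic proof of the sliding rule from the introduction can be replayed mechanically. The following Python sketch is only an illustration and is not the CIRC implementation: the tuple encoding of regular expressions and the names `plus`, `cat`, `deriv` and `bisimilar` are assumptions made for this example. Sums are stored as sets, which is one simple way to work modulo associativity, commutativity and idempotence.

```python
# Regular expressions as nested tuples (an illustrative encoding, not CIRC syntax):
#   ("0",) empty language, ("1",) empty word, ("sym", a),
#   ("+", frozenset_of_summands), (".", r, s), ("*", r)
ZERO, ONE = ("0",), ("1",)

def sym(a):
    return ("sym", a)

def star(r):
    return ("*", r)

def plus(r, s):
    """Union, kept in normal form modulo ACI and the unit law r + 0 = r."""
    terms = set()
    for t in (r, s):
        terms |= set(t[1]) if t[0] == "+" else {t}
    terms.discard(ZERO)
    if not terms:
        return ZERO
    return next(iter(terms)) if len(terms) == 1 else ("+", frozenset(terms))

def cat(r, s):
    """Concatenation with the simplifications 0r = r0 = 0 and 1r = r1 = r."""
    if ZERO in (r, s):
        return ZERO
    if r == ONE:
        return s
    return r if s == ONE else (".", r, s)

def nullable(r):
    """True iff r accepts the empty word (the 'output' of the state r)."""
    tag = r[0]
    if tag in ("1", "*"):
        return True
    if tag in ("0", "sym"):
        return False
    if tag == "+":
        return any(nullable(t) for t in r[1])
    return nullable(r[1]) and nullable(r[2])  # concatenation

def deriv(r, a):
    """Brzozowski derivative of r with respect to the letter a."""
    tag = r[0]
    if tag in ("0", "1"):
        return ZERO
    if tag == "sym":
        return ONE if r[1] == a else ZERO
    if tag == "+":
        out = ZERO
        for t in r[1]:
            out = plus(out, deriv(t, a))
        return out
    if tag == "*":
        return cat(deriv(r[1], a), r)
    left = cat(deriv(r[1], a), r[2])  # concatenation
    return plus(left, deriv(r[2], a)) if nullable(r[1]) else left

def bisimilar(r, s, alphabet):
    """Return a bisimulation containing (r, s), or None if none exists."""
    relation, todo = set(), [(r, s)]
    while todo:
        pair = todo.pop()
        if pair in relation:
            continue
        if nullable(pair[0]) != nullable(pair[1]):
            return None
        relation.add(pair)
        todo.extend((deriv(pair[0], a), deriv(pair[1], a)) for a in alphabet)
    return relation

a, b = sym("a"), sym("b")
lhs = cat(a, star(cat(b, a)))  # a(ba)*
rhs = cat(star(cat(a, b)), a)  # (ab)*a
print(len(bisimilar(lhs, rhs, "ab")))  # 3, the three pairs of R above
```

No choice is made anywhere in `bisimilar`: the relation is simply closed under derivatives, which is the point of the comparison with the algebraic proof.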

2. Regular Expressions for Non-deterministic Coalgebras

In this section, we briefly recall the basic definitions in [23].

Let $\mathbf{Set}$ denote the category of sets (represented by capital letters $X, Y, \ldots$) and functions (represented by lower case letters $f, g, \ldots$). We write $Y^X$ for the family of functions from $X$ to $Y$ and $\mathcal{P}_\omega(X)$ for the collection of finite subsets of a set $X$. The product of two sets $X, Y$ is written as $X \times Y$ and has the projection functions $\pi_1$ and $\pi_2$: $X \xleftarrow{\pi_1} X \times Y \xrightarrow{\pi_2} Y$. We define $X ✸+ Y = X \uplus Y \uplus \{\bot, \top\}$, where $\uplus$ is the disjoint union of sets, with injections $X \xrightarrow{\kappa_1} X \uplus Y \xleftarrow{\kappa_2} Y$. Note that the set $X ✸+ Y$ is different from the classical coproduct of $X$ and $Y$ (which we shall denote by $X + Y$), because of the two extra elements $\bot$ and $\top$. These extra elements are used to represent, respectively, underspecification and inconsistency in the specification of some systems.

For each of the operations defined above on sets, there are analogous ones on functions. Let $f : X \to Y$, $f_1 : X \to Y$ and $f_2 : Z \to W$. We define the following operations:

$$
\begin{array}{ll}
f_1 \times f_2 : X \times Z \to Y \times W & f_1 ✸+ f_2 : X ✸+ Z \to Y ✸+ W \\
(f_1 \times f_2)(x, z) = \langle f_1(x), f_2(z) \rangle & (f_1 ✸+ f_2)(c) = c,\ c \in \{\bot, \top\} \\
 & (f_1 ✸+ f_2)(\kappa_i(x)) = \kappa_i(f_i(x)),\ i \in \overline{1,2} \\
f^A : X^A \to Y^A & \mathcal{P}_\omega(f) : \mathcal{P}_\omega(X) \to \mathcal{P}_\omega(Y) \\
f^A(g) = f \circ g & \mathcal{P}_\omega(f)(X) = \{ y \in Y \mid f(x) = y,\ x \in X \}
\end{array}
$$

Remark 1. For the sake of brevity, we use the notation $i \in \overline{1,n}$ as a shorthand for $i \in \{1, \ldots, n\}$.

Note that in the definition above we are using the same symbols that we defined above for the operations on sets. It will always be clear from the context which operation is being used.

In our definition of non-deterministic functors we will use constant sets equipped with an information order. In particular, we will use join-semilattices. A (bounded) join-semilattice is a set $B$ equipped with a binary operation $\vee_B$ and a constant $\bot_B \in B$, such that $\vee_B$ is commutative, associative and idempotent. The element $\bot_B$ is neutral with respect to $\vee_B$. As usual, $\vee_B$ gives rise to a partial ordering $\le_B$ on the elements of $B$: $b_1 \le_B b_2 \Leftrightarrow b_1 \vee_B b_2 = b_2$. Every set $S$ can be mapped into a join-semilattice by taking $B$ to be the set of all finite subsets of $S$ with empty set as $\bot_B$, and union as join.

Coalgebras. A coalgebra is a pair $(S, g : S \to G(S))$, where $S$ is a set of states and $G : \mathbf{Set} \to \mathbf{Set}$ is a functor. The functor $G$, together with the function $g$, determines the transition structure (or dynamics) of the $G$-coalgebra [20]. A coalgebra $(S, g)$ is finite if $S$ is a finite set.

Definition 1 (Bisimulation). Let $(S, f)$ and $(T, g)$ be two $G$-coalgebras. We call a relation $R \subseteq S \times T$ a bisimulation [9] iff

$$(s, t) \in R \ \Rightarrow\ (f(s), g(t)) \in \overline{G}(R)$$

where $\overline{G}(R)$ is defined as $\overline{G}(R) = \{ (G(\pi_1)(x), G(\pi_2)(x)) \mid x \in G(R) \}$.

We write $s \sim_G t$ whenever there exists a bisimulation relation containing $(s, t)$ and we call $\sim_G$ the bisimilarity relation. It is of interest to remark that the relation $\sim_G$ is an equivalence relation. We shall drop the subscript $G$ whenever the functor $G$ is clear from the context. In the literature, one finds different definitions of bisimulation or behavioral equivalence [24]. For the class of functors we consider here the different notions coincide and therefore we will not discuss them.

Non-deterministic functors. They are functors $G : \mathbf{Set} \to \mathbf{Set}$ built inductively from the identity, and constants, using $\times$, $✸+$, $(-)^A$ and $\mathcal{P}_\omega$:

$$\mathrm{NDF} \ni G ::= \mathrm{Id} \mid B \mid G ✸+ G \mid G \times G \mid G^A \mid \mathcal{P}_\omega G \qquad (1)$$

where $B$ is a finite join-semilattice and $A$ is a finite set. Typical examples of non-deterministic functors include $S = B \times \mathrm{Id}$, $M = (B \times \mathrm{Id})^A$, $D = 2 \times \mathrm{Id}^A$, $Q = (1 ✸+ \mathrm{Id})^A$, $N = 2 \times \mathcal{P}_\omega(\mathrm{Id})^A$ and $L = 1 ✸+ \mathcal{P}_\omega(\mathrm{Id})^A$. These functors represent, respectively, the type of streams, Mealy, deterministic, partial deterministic automata, non-deterministic automata and labeled transition systems with explicit termination. $S$-bisimulation is stream equality, whereas $D$-bisimulation coincides with language equivalence.

Remark 2. As stated in [23], the use of join-semilattices for constant functors and the sum $✸+$ instead of the ordinary coproduct enabled the use of underspecification and inconsistency (i.e., $\bot$ and $\top$, respectively) in the specification of systems, and moreover, has allowed the whole framework to be studied in the category $\mathbf{Set}$. Even though underspecification and inconsistency can be captured by a semilattice structure, and the axiomatization provides the set of expressions with a join-semilattice structure (therefore allowing the work directly in the category of join-semilattices), remaining in the category $\mathbf{Set}$ was chosen for simplicity.
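To make Definition 1 concrete, consider the deterministic automata functor $D = 2 \times \mathrm{Id}^A$. Unfolding the relation lifting for this functor, a relation $R$ is a bisimulation when related states agree on acceptance and their successors are again related for every letter. The Python sketch below checks exactly this condition; the dictionary encoding of a coalgebra (state mapped to a pair of an acceptance flag and a successor table) and the name `is_bisimulation` are assumptions of this illustration.

```python
def is_bisimulation(R, f, g, alphabet):
    """Check Definition 1 for D = 2 x Id^A.

    f and g map each state to (accepting, {letter: next_state})."""
    for s, t in R:
        (accept_s, next_s), (accept_t, next_t) = f[s], g[t]
        if accept_s != accept_t:
            return False  # the outputs in 2 differ
        if any((next_s[a], next_t[a]) not in R for a in alphabet):
            return False  # some pair of successors is not related
    return True

# Two automata over A = {a}, both accepting words with an even number of a's.
f = {0: (True, {"a": 1}), 1: (False, {"a": 0})}
g = {n: (n % 2 == 0, {"a": (n + 1) % 4}) for n in range(4)}
R = {(n % 2, n) for n in range(4)}

print(is_bisimulation(R, f, g, "a"))         # True
print(is_bisimulation({(0, 1)}, f, g, "a"))  # False
```

Since state 0 of `f` and state 0 of `g` are related by a bisimulation, they are $D$-bisimilar and hence accept the same language.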

Next, we give the definition of the ingredient relation, which relates a non-deterministic functor $G$ with its ingredients, i.e., the functors used in its inductive construction. We shall use this relation later for typing our expressions.

Definition 2. Let $\rhd \subseteq \mathrm{NDF} \times \mathrm{NDF}$ be the least reflexive and transitive relation on non-deterministic functors such that

$$G_1 \rhd G_1 \times G_2, \quad G_2 \rhd G_1 \times G_2, \quad G_1 \rhd G_1 ✸+ G_2, \quad G_2 \rhd G_1 ✸+ G_2, \quad G \rhd G^A, \quad G \rhd \mathcal{P}_\omega G.$$

Here and throughout this document we use $F \rhd G$ as a shorthand for $(F, G) \in \rhd$. If $F \rhd G$, then $F$ is said to be an ingredient of $G$. For example, $2$, $\mathrm{Id}$, $\mathrm{Id}^A$ and $D$ itself are all the ingredients of the deterministic automata functor $D$.

A language of regular expressions for non-deterministic coalgebras. We now associate a language of expressions $\mathsf{Exp}_G$ with each non-deterministic functor $G$.

Definition 3 (Expressions). Let $A$ be a finite set, $B$ a finite join-semilattice and $X$ a set of fixed-point variables. The set $\mathsf{Exp}$ of all expressions is given by the following grammar, where $a \in A$, $b \in B$ and $x \in X$:

$$\varepsilon ::= x \mid \varepsilon \oplus \varepsilon \mid \gamma \qquad (2)$$

where $\gamma$ is a guarded expression given by:

$$\gamma ::= \emptyset \mid \gamma \oplus \gamma \mid \mu x.\gamma \mid b \mid l\langle\varepsilon\rangle \mid r\langle\varepsilon\rangle \mid l[\varepsilon] \mid r[\varepsilon] \mid a(\varepsilon) \mid \{\varepsilon\} \qquad (3)$$

In the expression $\mu x.\gamma$, $\mu$ is a binder for all the free occurrences of $x$ in $\gamma$. Variables that are not bound are free. A closed expression is an expression without free occurrences of fixed-point variables $x$. We denote the set of closed expressions by $\mathsf{Exp}^c$.

The language of expressions for non-deterministic coalgebras is a generalization of the classical notion of regular expressions: $\emptyset$, $\varepsilon_1 \oplus \varepsilon_2$ and $\mu x.\gamma$ play similar roles to the regular expressions denoting the empty language, the union of languages and the Kleene star. Moreover, note that, not unexpectedly, in [23], $\oplus$ was axiomatized as an associative, commutative and idempotent operator, with $\emptyset$ as a neutral element. The expressions $l\langle\varepsilon\rangle$, $r\langle\varepsilon\rangle$, $l[\varepsilon]$, $r[\varepsilon]$, $a(\varepsilon)$ and $\{\varepsilon\}$ specify the left and right hand-side of products and sums, function application and singleton sets, respectively. Next, we present a type assignment system for associating expressions to non-deterministic functors. This will allow us to associate with each functor $G$ the expressions $\varepsilon \in \mathsf{Exp}^c$ that are valid specifications of $G$-coalgebras.

Definition 4 (Type system). We now define a typing relation $\vdash\ \subseteq \mathsf{Exp} \times \mathrm{NDF} \times \mathrm{NDF}$ that will associate an expression $\varepsilon$ with two non-deterministic functors $F$ and $G$, which are related by the ingredient relation ($F$ is an ingredient of $G$). We shall write $\vdash \varepsilon : F \rhd G$ for $(\varepsilon, F, G) \in\ \vdash$. The rules that define $\vdash$ are the following:

$$
\begin{array}{c}
\dfrac{}{\vdash \emptyset : F \rhd G}
\qquad
\dfrac{}{\vdash b : B \rhd G}\ (b \in B)
\qquad
\dfrac{}{\vdash x : G \rhd G}\ (x \in X)
\qquad
\dfrac{\vdash \varepsilon : G \rhd G}{\vdash \mu x.\varepsilon : G \rhd G}
\\[2ex]
\dfrac{\vdash \varepsilon_1 : F \rhd G \quad \vdash \varepsilon_2 : F \rhd G}{\vdash \varepsilon_1 \oplus \varepsilon_2 : F \rhd G}
\qquad
\dfrac{\vdash \varepsilon : G \rhd G}{\vdash \varepsilon : \mathrm{Id} \rhd G}
\qquad
\dfrac{\vdash \varepsilon : F_2 \rhd G}{\vdash r[\varepsilon] : F_1 ✸+ F_2 \rhd G}
\qquad
\dfrac{\vdash \varepsilon : F \rhd G}{\vdash a(\varepsilon) : F^A \rhd G}\ (a \in A)
\\[2ex]
\dfrac{\vdash \varepsilon : F_1 \rhd G}{\vdash l\langle\varepsilon\rangle : F_1 \times F_2 \rhd G}
\qquad
\dfrac{\vdash \varepsilon : F_2 \rhd G}{\vdash r\langle\varepsilon\rangle : F_1 \times F_2 \rhd G}
\qquad
\dfrac{\vdash \varepsilon : F_1 \rhd G}{\vdash l[\varepsilon] : F_1 ✸+ F_2 \rhd G}
\qquad
\dfrac{\vdash \varepsilon : F \rhd G}{\vdash \{\varepsilon\} : \mathcal{P}_\omega F \rhd G}
\end{array}
$$

We can now formally define the set of $G$-expressions: well-typed expressions associated with a non-deterministic functor $G$.

Definition 5 ($G$-expressions). Let $G$ be a non-deterministic functor and $F$ an ingredient of $G$. We define $\mathsf{Exp}_{F \rhd G}$ by:

$$\mathsf{Exp}_{F \rhd G} = \{ \varepsilon \in \mathsf{Exp}^c \mid\ \vdash \varepsilon : F \rhd G \}.$$

We define the set $\mathsf{Exp}_G$ of well-typed $G$-expressions by $\mathsf{Exp}_{G \rhd G}$.

In [23], it was proved that the set of $G$-expressions for a given non-deterministic functor $G$ has a coalgebraic structure:

$$\delta_G : \mathsf{Exp}_G \to G(\mathsf{Exp}_G)$$

More precisely, in [23], which we refer to for the complete definition of $\delta_G$, the authors defined a function $\delta_{F \rhd G} : \mathsf{Exp}_{F \rhd G} \to F(\mathsf{Exp}_G)$ and then set $\delta_G = \delta_{G \rhd G}$.

The coalgebraic structure on the set of expressions enabled the proof of a Kleene-like theorem.

Theorem 1 (Kleene's theorem for non-deterministic coalgebras). Let $G$ be a non-deterministic functor.
1. For any $\varepsilon \in \mathsf{Exp}_G$, there exists a finite $G$-coalgebra $(S, g)$ and $s \in S$ such that $\varepsilon \sim s$.
2. For every finite $G$-coalgebra $(S, g)$ and $s \in S$ there exists an expression $\varepsilon_s \in \mathsf{Exp}_G$ such that $\varepsilon_s \sim s$.

In order to provide the reader with intuition over the notions presented above, we illustrate them with an example.
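The typing rules of Definition 4 are almost syntax-directed, so they can be turned into a recursive check. The following Python sketch is an illustration only, not the equational specification used in CIRC: the tuple encodings of functors and expressions and the name `well_typed` are assumptions of this example, the caller is assumed to pass an ingredient $F$ of $G$, and closedness of the expression is not checked.

```python
# Functors (illustrative encoding): ("Id",), ("B", frozenset_of_elements),
#   ("sum", F1, F2), ("prod", F1, F2), ("exp", F, frozenset_alphabet), ("pow", F)
# Expressions: ("empty",), ("var", x), ("plus", e1, e2), ("mu", x, e),
#   ("const", b), ("lprod", e), ("rprod", e), ("lsum", e), ("rsum", e),
#   ("act", a, e), ("set", e)
ID = ("Id",)

def well_typed(e, F, G):
    """Decide |- e : F |> G following the rules of Definition 4."""
    tag = e[0]
    if tag == "empty":
        return True
    if F == ID and G != ID:
        # rule: from |- e : G |> G infer |- e : Id |> G
        return well_typed(e, G, G)
    if tag == "plus":
        return well_typed(e[1], F, G) and well_typed(e[2], F, G)
    if tag == "var":
        return F == G
    if tag == "mu":
        return F == G and well_typed(e[2], G, G)
    if tag == "const":
        return F[0] == "B" and e[1] in F[1]
    if tag in ("lprod", "rprod"):
        side = 1 if tag == "lprod" else 2
        return F[0] == "prod" and well_typed(e[1], F[side], G)
    if tag in ("lsum", "rsum"):
        side = 1 if tag == "lsum" else 2
        return F[0] == "sum" and well_typed(e[1], F[side], G)
    if tag == "act":
        return F[0] == "exp" and e[1] in F[2] and well_typed(e[2], F[1], G)
    if tag == "set":
        return F[0] == "pow" and well_typed(e[1], F[1], G)
    return False

# Stream functor S = B x Id over the two-element semilattice B = {0, 1}.
B = ("B", frozenset({0, 1}))
S = ("prod", B, ID)
one = ("const", 1)
good = ("mu", "x", ("plus", ("rprod", ("var", "x")), ("lprod", one)))
print(well_typed(good, S, S))              # True
print(well_typed(("lsum", one), S, S))     # False: S does not involve the sum
print(well_typed(("mu", "x", one), S, S))  # False: 1 has type B, not S
```

The only rule that is not driven by the shape of the expression is the one for $\mathrm{Id}$; the sketch applies it eagerly, which is harmless because every other way of typing an expression at $\mathrm{Id} \rhd G$ goes through the same premise.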

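Example 1. Let us instantiate the definition of $G$-expressions to the functor of streams $S = B \times \mathrm{Id}$ (the ingredients of this functor are $B$, $\mathrm{Id}$ and $S$ itself). Let $X$ be a set of (recursion or) fixed-point variables. The set $\mathsf{Exp}_S$ of stream expressions is given by the set of closed, guarded expressions generated by the following BNF grammar. For $x \in X$:

$$
\begin{array}{l}
\mathsf{Exp}_S \ni \varepsilon ::= \emptyset \mid \varepsilon \oplus \varepsilon \mid \mu x.\varepsilon \mid x \mid l\langle\tau\rangle \mid r\langle\varepsilon\rangle \\
\tau ::= \emptyset \mid b \mid \tau \oplus \tau
\end{array}
\qquad (4)
$$

Intuitively, the expression $l\langle b\rangle$ is used to specify that the head of the stream is $b$, while $r\langle\varepsilon\rangle$ specifies a stream whose tail behaves as specified by $\varepsilon$. For the two-element join-semilattice $B = \{0, 1\}$ (with $\bot_B = 0$), examples of well-typed expressions include $\emptyset$, $l\langle 1\rangle \oplus r\langle l\langle\emptyset\rangle\rangle$ and $\mu x.r\langle x\rangle \oplus l\langle 1\rangle$. The expressions $l[1]$, $l\langle 1\rangle \oplus 1$ and $\mu x.1$ are not well-typed for $S$, because the functor $S$ does not involve $✸+$, the subexpressions in the sum have different type, and recursion is not at the outermost level ($1$ has type $B \rhd S$), respectively.

By applying the definition in [23], the coalgebra structure on expressions $\delta_S$ would be given by:

$$
\begin{array}{l}
\delta_S : \mathsf{Exp}_S \to B \times \mathsf{Exp}_S \\
\delta_S(\emptyset) = \langle \bot_B, \emptyset \rangle \\
\delta_S(\varepsilon_1 \oplus \varepsilon_2) = \langle b_1 \vee b_2,\ \varepsilon_1' \oplus \varepsilon_2' \rangle \quad \text{where } \langle b_i, \varepsilon_i' \rangle = \delta_S(\varepsilon_i),\ i \in \overline{1,2} \\
\delta_S(\mu x.\varepsilon) = \delta_S(\varepsilon[\mu x.\varepsilon / x]) \\
\delta_S(l\langle\tau\rangle) = \langle \delta_{B \rhd S}(\tau), \emptyset \rangle \\
\delta_S(r\langle\varepsilon\rangle) = \langle \bot_B, \varepsilon \rangle \\
\delta_{B \rhd S}(\emptyset) = \bot_B \\
\delta_{B \rhd S}(b) = b \\
\delta_{B \rhd S}(\tau \oplus \tau') = \delta_{B \rhd S}(\tau) \vee \delta_{B \rhd S}(\tau')
\end{array}
$$

The proof of Kleene's theorem provides algorithms to go from expressions to streams and vice-versa. We illustrate it by means of examples.

Consider the stream with three states $s_1$, $s_2$ and $s_3$, where $s_1$ steps to $s_2$, and $s_2$ and $s_3$ step to each other; it represents $(1, 0, 1, 0, 1, 0, 1, \ldots)$. To compute expressions $\varepsilon_1$, $\varepsilon_2$ and $\varepsilon_3$ equivalent to $s_1$, $s_2$ and $s_3$ we associate with each state $s_i$ a variable $x_i$ and get the equations:

$$\varepsilon_1 = \mu x_1.l\langle 1\rangle \oplus r\langle x_2\rangle \qquad \varepsilon_2 = \mu x_2.l\langle 0\rangle \oplus r\langle x_3\rangle \qquad \varepsilon_3 = \mu x_3.l\langle 1\rangle \oplus r\langle x_2\rangle$$

As our goal is to remove all the occurrences of free variables in our expressions, we proceed as follows. First we substitute $x_2$ by $\varepsilon_2$ in $\varepsilon_1$, and $x_3$ by $\varepsilon_3$ in $\varepsilon_2$, and obtain the following expressions:

$$\varepsilon_1 = \mu x_1.l\langle 1\rangle \oplus r\langle \varepsilon_2\rangle \qquad \varepsilon_2 = \mu x_2.l\langle 0\rangle \oplus r\langle \varepsilon_3\rangle$$

Note that at this point $\varepsilon_1$ and $\varepsilon_2$ already denote closed expressions. Therefore, as a last step, we replace $x_2$ in $\varepsilon_3$ by $\varepsilon_2$ and get the following closed expressions:

$$\varepsilon_1 = \mu x_1.l\langle 1\rangle \oplus r\langle \varepsilon_2\rangle \qquad \varepsilon_2 = \mu x_2.l\langle 0\rangle \oplus r\langle \varepsilon_3\rangle \qquad \varepsilon_3 = \mu x_3.l\langle 1\rangle \oplus r\langle \mu x_2.l\langle 0\rangle \oplus r\langle x_3\rangle\rangle$$

satisfying, by construction, $\varepsilon_1 \sim s_1$, $\varepsilon_2 \sim s_2$ and $\varepsilon_3 \sim s_3$.

For the converse construction, consider the expression $\varepsilon = (\mu x.r\langle x\rangle) \oplus l\langle 1\rangle$. We construct an automaton by repeatedly applying the coalgebra structure on expressions $\delta_S$, modulo associativity, commutativity and idempotence (ACI) of $\oplus$ in order to guarantee finiteness.

First, note that $\delta_S(\mu x.r\langle x\rangle) = \delta_S(r\langle \mu x.r\langle x\rangle\rangle) = \langle \bot_B, \mu x.r\langle x\rangle \rangle$. Applying the definition of $\delta_S$ above, we have:

$$\delta_S(\varepsilon) = \langle 1, (\mu x.r\langle x\rangle) \oplus \emptyset \rangle \quad \text{and} \quad \delta_S((\mu x.r\langle x\rangle) \oplus \emptyset) = \langle 0, (\mu x.r\langle x\rangle) \oplus \emptyset \rangle$$

which leads to a stream (automaton) with two states: $\varepsilon$, with output 1, steps to $(\mu x.r\langle x\rangle) \oplus \emptyset$, which has output 0 and steps to itself.

Note that applying $\delta_S$ without ACI might generate infinite automata. Take, for instance, the expression $\varepsilon = \mu x.r\langle x \oplus x\rangle$. Note that $\delta_S(\mu x.r\langle x \oplus x\rangle) = \langle 0, \varepsilon \oplus \varepsilon \rangle$, $\delta_S(\varepsilon \oplus \varepsilon) = \langle 0, (\varepsilon \oplus \varepsilon) \oplus (\varepsilon \oplus \varepsilon) \rangle$, and so on. This would generate the infinite automaton with states $\varepsilon$, $\varepsilon \oplus \varepsilon$, $(\varepsilon \oplus \varepsilon) \oplus (\varepsilon \oplus \varepsilon)$, $\ldots$ instead of the intended, simple and very finite, automaton with the single state $\varepsilon$. The axiom $\varepsilon \oplus \emptyset \equiv \varepsilon$ could also be used in order to obtain smaller automata, but it is not crucial for termination.

Throughout the paper, we will often use streams as a basic example to illustrate the definitions. It should be remarked that the framework is general enough to include more complex examples, such as deterministic automata, automata on guarded strings, Mealy machines and labelled transition systems. The latter two will be used as examples in Section 6.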
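The equations for $\delta_S$ in Example 1 translate directly into a small interpreter. The Python sketch below is an illustration under stated assumptions: expressions are encoded as tuples, $B = \{0, 1\}$ with join `max` and bottom `0`, substitution assumes the substituted expression is closed (so no variable capture can occur), and `normalize` only removes duplicates among top-level summands, which suffices for the two expressions discussed above. The names `delta`, `take`, `normalize` and `states` are hypothetical.

```python
# Stream expressions: ("empty",), ("plus", e1, e2), ("mu", x, e), ("var", x),
#   ("l", tau), ("r", e); tau is ("empty",), ("const", b) or ("plus", t1, t2).
EMPTY = ("empty",)

def subst(e, x, by):
    """e[by/x]; `by` is assumed closed, so no capture can occur."""
    tag = e[0]
    if tag == "var":
        return by if e[1] == x else e
    if tag == "mu":
        return e if e[1] == x else ("mu", e[1], subst(e[2], x, by))
    if tag == "plus":
        return ("plus", subst(e[1], x, by), subst(e[2], x, by))
    if tag == "r":
        return ("r", subst(e[1], x, by))
    return e  # empty and l<tau> contain no fixed-point variables

def delta_b(t):
    """The component of delta for the ingredient B: evaluate tau in B."""
    if t[0] == "const":
        return t[1]
    if t[0] == "plus":
        return max(delta_b(t[1]), delta_b(t[2]))
    return 0  # empty denotes the bottom element

def delta(e):
    """delta_S on closed, guarded stream expressions: (head, tail)."""
    tag = e[0]
    if tag == "empty":
        return (0, EMPTY)
    if tag == "plus":
        (b1, e1), (b2, e2) = delta(e[1]), delta(e[2])
        return (max(b1, b2), ("plus", e1, e2))
    if tag == "mu":
        return delta(subst(e[2], e[1], e))
    if tag == "l":
        return (delta_b(e[1]), EMPTY)
    if tag == "r":
        return (0, e[1])
    raise ValueError("delta_S is defined on closed expressions only")

def take(e, n):
    """First n elements of the stream denoted by e."""
    out = []
    for _ in range(n):
        b, e = delta(e)
        out.append(b)
    return out

def normalize(e):
    """Canonical form modulo ACI of the top-level occurrences of plus."""
    def summands(t):
        return summands(t[1]) | summands(t[2]) if t[0] == "plus" else {t}
    terms = sorted(summands(e), key=repr)
    out = terms[0]
    for t in terms[1:]:
        out = ("plus", out, t)
    return out

def states(e):
    """States reachable from e when every tail is normalized."""
    seen, todo = set(), [normalize(e)]
    while todo:
        s = todo.pop()
        if s not in seen:
            seen.add(s)
            todo.append(normalize(delta(s)[1]))
    return seen

mu = ("mu", "x", ("r", ("var", "x")))
eps = ("plus", mu, ("l", ("const", 1)))  # (mu x.r<x>) + l<1>
loop = ("mu", "x", ("r", ("plus", ("var", "x"), ("var", "x"))))
print(take(eps, 4))       # [1, 0, 0, 0]
print(len(states(eps)))   # 2
print(len(states(loop)))  # 1
```

Dropping the call to `normalize` in `states` makes the exploration of `loop` diverge, which mirrors the infinite automaton described in the example.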

3. A Decision Procedure for the Equivalence of Generalized Regular Expressions

In this section, we briefly describe the decision procedure to determine whether two expressions are equivalent or not.

The key observation is that point 1. of Theorem 1 above guarantees that each expression in the language for a given system can always be associated to a finite coalgebra. Given two expressions $\varepsilon_1$ and $\varepsilon_2$ in the language $\mathsf{Exp}_G$ of a given functor $G$ we can decide whether they are equivalent by constructing a finite bisimulation between them. This is because the finite coalgebra generated from an expression contains precisely all states that one needs to construct the equivalence relation. Even though this might seem like a trivial observation, it has very concrete consequences: for (all well-typed) generalized regular expressions we can always either determine that they are bisimilar, and exhibit a proof in the form of a bisimulation, or conclude that they are not bisimilar and pinpoint the difference by showing why the bisimulation construction failed. Hence, we have a decision procedure for equivalence of generalized regular expressions.

We will give the reader a brief example on how the equivalence check works. Further examples, for different types of systems, including examples of non-equivalence, will appear in Section 6.

We will show that the stream expressions $\varepsilon_1 = \mu x.r\langle x\rangle \oplus l\langle 0\rangle$ and $\varepsilon_2 = r\langle \mu x.r\langle x\rangle \oplus l\langle 0\rangle\rangle \oplus l\langle 0\rangle$ are equivalent. In order to do that, we have to build a bisimulation relation $R$ on expressions for the stream functor $S$, defined above, such that $(\varepsilon_1, \varepsilon_2) \in R$. We do this in the following way: we start by taking $R = \{(\varepsilon_1, \varepsilon_2)\}$ and we check whether this is already a bisimulation, by applying $\delta_S$ to each of the expressions and checking whether the expressions have the same output value and, moreover, that no new pairs of expressions (modulo associativity, commutativity and idempotence, for more details see page 25) appear when taking transitions. If new pairs of expressions appear we add them to $R$ and repeat the process.
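The loop just described can be sketched for streams as follows. This Python fragment is an illustration, not the CIRC procedure: it assumes that $\delta_S$ has already been applied and normalized, so that the finite coalgebra guaranteed by Theorem 1 is available as a table mapping each state to its output and successor. The state names and the function `decide` are hypothetical.

```python
def decide(x, y, step):
    """Build a bisimulation containing (x, y) for the stream functor.

    step maps a state to (output, next_state). Returns (True, R) with R the
    list of pairs visited, or (False, n) where n is the first position at
    which the two streams differ."""
    relation, pair, n = [], (x, y), 0
    while pair not in relation:
        (out_x, next_x), (out_y, next_y) = step[pair[0]], step[pair[1]]
        if out_x != out_y:
            return False, n  # the construction fails here
        relation.append(pair)  # new pair: add it and repeat
        pair, n = (next_x, next_y), n + 1
    return True, relation

# Hypothetical table: e1 and e2 output 0 and step to e1; u denotes (0, 1, 1, ...).
step = {"e1": (0, "e1"), "e2": (0, "e1"), "u": (0, "v"), "v": (1, "v")}
print(decide("e1", "e2", step))  # (True, [('e1', 'e2'), ('e1', 'e1')])
print(decide("e1", "u", step))   # (False, 1)
```

Because a stream state has exactly one successor, the work list degenerates to a single chain of pairs; for functors with several successors per state the same idea applies with a proper work list. The failure case returns the position of the mismatch, which is one way to pinpoint why two expressions are not bisimilar.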
Intuitively, for this particular example, the transition structure can be depicted as follows:

[Figure 1: Bisimulation construction]

Here, we omit the output values of the expressions, which are all 0. In the figure above, we use the notation $\varepsilon_1\, R\, \varepsilon_2$ to denote $(\varepsilon_1, \varepsilon_2) \in R$. As illustrated in Figure 1, $R = \{(\varepsilon_1, \varepsilon_2), (\varepsilon_1, \varepsilon_1)\}$ is closed under transitions and is therefore a bisimulation. Hence, $\varepsilon_1$ and $\varepsilon_2$ are bisimilar and specify the same infinite stream (concretely, the stream with only zeros).

4. An Algebraic View on the Coalgebra of Generalized Regular Expressions

Recall that our goal is to reason about equality of generalized regular expressions in a fully automated manner. As we showed in the introduction, obtaining this equality can be achieved in two distinct ways: either algebraically, reasoning with the axioms, or coalgebraically, by constructing a bisimulation relation. The latter, because of its algorithmic nature, is particularly suited for automation. Automatic constructions of bisimulations have been widely explored in CIRC and we will use this tool to implement our algorithm. This section contains material that enables us to soundly use CIRC. We want to stress however that the main result of the paper is the description of a decision procedure to determine whether two expressions are equivalent or not. This procedure in turn could be implemented in any other suitable tool or even as a standalone application. Choosing CIRC was natural for us, given the pre-existent work on bisimulation constructions. In Section 5, we show that the process of generating the $G$-coalgebras associated to expressions by repeatedly applying $\delta_G$ and normalizing the expressions obtained at each step is closely related to the proving mechanism already existent in CIRC.

In Section 2, we have introduced a (theoretical) framework which, given a functor $G$, allows for the uniform derivation of 1) a language $\mathsf{Exp}_G$ for specifying behaviors of $G$-systems, and 2) a coalgebraic structure on $\mathsf{Exp}_G$, which provides an operational semantics to the set of expressions. In this context, given that CIRC is based on algebraic specifications, we need two things in order to reach our final goal:

• extend and adapt the framework of Section 2 in order to enable the implementation of a tool which allows the automatic derivation of algebraic specifications that model 1) and 2) above, to deliver to CIRC;
• provide a decision procedure, implemented in CIRC based on an equational entailment relation, in order to check bisimilarity of expressions.

In the rest of the paper we will present the algebraic setting for reasoning on bisimilarity of generalized regular expressions. A brief overview on the parallel between the coalgebraic concepts in [23] and their algebraic correspondents introduced in this section is provided later, in Figure 2.

Algebraic speciﬁcations. An algebraic speciﬁcation is a triple E = ( S, Σ , E ),where S is a set of sorts , Σ is a S -sorted signature and E is a set of conditionalequations of the form ( ∀ X ) t = t ′ if ( V i ∈ I u i = v i ), where t , t ′ , u i , and v i ( i ∈ I – a set of indices for the conditions) are Σ-terms with variables in X . Wesay that the sort of the equation is s whenever t, t ′ ∈ T Σ ,s ( X ). Here, T Σ ,s ( X )denotes the set of terms of sort s of the Σ-algebra freely generated by X . If I = {} then the equation is unconditional and may be written as ( ∀ X ) t = t ′ .Let ⊢ be the equational entailment (deduction) relation deﬁned as in [6]. Forconsistency reasons, we write E ⊢ e whenever equation e is deducible from the11quations E in E by reﬂexivity, symmetry, transitivity, congruence or substitu-tivity ( i.e. , whenever E ⊢ e ).In this paper, the algebraic speciﬁcations of coalgebras of generalized regularexpressions are built on top of deﬁnitions based on grammars in Backus-Naurform (BNF) such as (1) and (2). Therefore, in what follows, we introduce thegeneral technique for transforming BNF notations into algebraic speciﬁcations. From BNF grammars to algebraic speciﬁcations.

The general rule usedfor translating deﬁnitions based on BNF grammars into algebraic speciﬁcationsis as follows: each syntactical category and vocabulary is considered as a sort andeach production is considered as a constructor operation or a subsort relation.For instance, according to the grammar (1) of non-deterministic functors,we have a sort

SltName – representing the vocabulary of join-semilattices B ,a sort AlphName – for the vocabulary of the alphabets A , a sort Functor –associated to the syntactical category of the non-deterministic functors G , asubsort relation SltName < Functor representing the production G :: = B , andconstructor operations for the other productions.Generally, each production A ::= rhs gives rise to a constructor ( rhs ) → ( A ),the direction of the arrow being reversed. For instance, for grammar (1), the pro-duction G ::= Id is represented by a constant (nullary operation) Id : → Functor ,and the sum construction by the binary operation ✸ + : Functor Functor → Functor . Remark 3.

Note that the above mechanism for translating BNF grammars intoalgebraic speciﬁcations makes use of subsort relations for representing produc-tions such as G ::= B . This is because CIRC works with order-sorted algebras,and we want to keep the algebraic speciﬁcations of non-deterministic functors asclose as possible to their implementation in

CIRC . However, order-sorted alge-bras can be reduced to many-sorted algebras [6], where a subsort relation s < s ′ is modeled by an inclusion operation c s,s ′ : s → s ′ . This way, even if we useorder-sorted algebras, we remain in the framework of circular coinduction. The algebraic speciﬁcations of coalgebras of generalized regular expressionsare deﬁned in a modular fashion, based on the speciﬁcations of: • non-deterministic functors ( G ); • generalized regular expressions ( ε ∈ Exp G ); • “transition” functions ( δ G ); • “structured” expressions ( σ ∈ F ( Exp G ), for all F ingredients of G ).Moreover, recall that for a non-deterministic functor G , bisimilarity of G -expressions is decided based on the relation lifting G over “structured” expres-sions in G ( Exp G ) (Deﬁnition 1). Therefore, the deduction relation ⊢ has to beextended to allow a restricted contextual reasoning over “structured” expres-sions in F ( Exp G ), for all ingredients F of G .12he aforementioned algebraic speciﬁcations and the extension of ⊢ are mod-eled as follows. The algebraic speciﬁcation of a non-deterministic functor G . It in-cludes: • the translation of the BNF grammar (1), as presented above; • the speciﬁcation of the functor ingredients, given by a sort Ingredient and aconstructor ⊳ : Functor Functor → Ingredient (according to Deﬁnition 2); • the speciﬁcation of each alphabet A = { a , . . . , a n } occurring in the def-inition of G : this consists of a subsort A <

Alph , a constant a i : → A for i ∈ , n , and a distinguished constant A of sort AlphName used to referthe alphabet in the deﬁnition of the functor; • the speciﬁcation of each semilattice B = ( { b , . . . , b n } , ∨ , ⊥ B ) occurring inthe deﬁnition of G : this consists of a subsort B < Slt , a constant b i : → B for i ∈ , n , a distinguished constant B of sort SltName used to referthe corresponding semilattice in the deﬁnition of the functor, and theequations deﬁning ∨ and ⊥ B (this should be one of b i ); • an equation deﬁning G (as a functor expression). The algebraic speciﬁcation of generalized regular expressions.

It consists of:
• (according to the BNF grammar in Definition 3) a sort $\mathit{Exp}$ representing expressions $\varepsilon$, $\mathit{FixpVar}$ the sort for the vocabulary of the fixed-point variables, and $\mathit{Slt}$ the sort for the elements of semilattices. Moreover, we consider constructor operations for all the productions. For example, the production $\varepsilon ::= \varepsilon \oplus \varepsilon$ is represented by an operation $\_\oplus\_ : \mathit{Exp}\ \mathit{Exp} \to \mathit{Exp}$, and $\varepsilon ::= \mu x.\gamma$ is represented by $\mu\_.\_ : \mathit{FixpVar}\ \mathit{Exp} \to \mathit{Exp}$. (We chose not to impose any restriction guaranteeing that $\gamma$ is a guarded expression at this stage in the definition of $\mu\_.\_$. However, guards can easily be checked by pattern matching, according to the grammars in Definition 3);
• the specification of the substitution of a fixed-point variable with an expression, given by an operation $\_[\_/\_] : \mathit{Exp}\ \mathit{Exp}\ \mathit{FixpVar} \to \mathit{Exp}$ and a set of equations, one for each constructor. For example, the equations associated to $\emptyset$ and $\oplus$ are $\emptyset[\varepsilon/x] = \emptyset$ and, respectively, $(\varepsilon_1 \oplus \varepsilon_2)[\varepsilon/x] = (\varepsilon_1[\varepsilon/x]) \oplus (\varepsilon_2[\varepsilon/x])$, where $\varepsilon, \varepsilon_1, \varepsilon_2$ are $G$-expressions and $x$ is a fixed-point variable;
• the specification of the type-checking relation in Definition 4, given by an operation $\_:\_ : \mathit{Exp}\ \mathit{Ingredient} \to \mathit{Bool}$ and an equation for each inference rule defining this relation. For example, the rule
\[
\frac{\vdash \varepsilon_1 : F \rhd G \qquad \vdash \varepsilon_2 : F \rhd G}{\vdash \varepsilon_1 \oplus \varepsilon_2 : F \rhd G}
\]
is represented by the equation $\varepsilon_1 \oplus \varepsilon_2 : F \rhd G = \varepsilon_1 : F \rhd G \wedge \varepsilon_2 : F \rhd G$. The type-checking operator is used in order to verify whether the expressions checked for equivalence are well-typed (Definition 5). Moreover, note that for consistency of notation, algebraically we write $\varepsilon : F \rhd G$ to represent expressions $\varepsilon$ of type $F \rhd G$.

The algebraic specification of $\delta_G$. It consists of:
• the specification of the coalgebra of $G$-expressions $\delta_G$, given by three operations $\delta\_(\_) : \mathit{Ingredient}\ \mathit{Exp} \to \mathit{ExpStruct}$, $\mathit{Empty} : \mathit{Ingredient} \to \mathit{ExpStruct}$, and $\mathit{Plus}\_(\_,\_) : \mathit{Ingredient}\ \mathit{ExpStruct}\ \mathit{ExpStruct} \to \mathit{ExpStruct}$;
• a set of equations describing the definitions of these operations as in [23].

The algebraic specification of structured expressions.

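As mentioned above, the set of $G$-expressions is provided with a coalgebraic structure given by the function $\delta_G : \mathit{Exp}_G \to G(\mathit{Exp}_G)$, where $G(\mathit{Exp}_G)$ can be understood as the set of expressions with structure given by $G$ (and its ingredients). The set of structured expressions is defined by the following grammar:
\[
\sigma ::= \varepsilon \mid b \mid \langle \sigma, \sigma \rangle \mid k_1(\sigma) \mid k_2(\sigma) \mid \bot \mid \top \mid \lambda.(a, F \rhd G, \sigma) \mid \{\sigma\} \tag{5}
\]
where $\varepsilon \in \mathit{Exp}_G$ and $b \in B$. The typing rules below give precise meaning to these expressions. Note that $\bot$ and $\top$ are two expressions coming from $G = G_1 \mathbin{\diamondsuit\!\!\!\!+} G_2$, used to denote underspecification and overspecification, respectively.

The associated algebraic specification includes:
• a sort $\mathit{ExpStruct}$ representing expressions $\sigma$ (from $F(\mathit{Exp}_G)$, with $F \rhd G$), and one operation for each production in the BNF grammar (5). Note that the construction $\lambda.(a, F \rhd G, \sigma)$ has as coalgebraic correspondent a function $f \in F^A(\mathit{Exp}_G)$, and is defined by cases as follows: $\lambda.(a, F \rhd G, \sigma)(a') = \textit{if } (a = a') \textit{ then } \sigma \textit{ else } \mathit{Empty}_{F \rhd G}$;
• the extension of the type-checking relation to structured expressions, defined by:
\[
\frac{\vdash b : B \rhd G}{\vdash b \in B(\mathit{Exp}_G)} \qquad
\frac{\vdash \varepsilon : Id \rhd G}{\vdash \varepsilon \in Id(\mathit{Exp}_G)} \qquad
\frac{}{\vdash \bot \in F_1 \mathbin{\diamondsuit\!\!\!\!+} F_2(\mathit{Exp}_G)} \qquad
\frac{}{\vdash \top \in F_1 \mathbin{\diamondsuit\!\!\!\!+} F_2(\mathit{Exp}_G)}
\]
\[
\frac{\vdash \sigma \in F_i(\mathit{Exp}_G)}{\vdash k_i(\sigma) \in F_1 \mathbin{\diamondsuit\!\!\!\!+} F_2(\mathit{Exp}_G)}\ i \in \{1,2\} \qquad
\frac{\vdash \sigma_1 \in F_1(\mathit{Exp}_G) \quad \vdash \sigma_2 \in F_2(\mathit{Exp}_G)}{\vdash \langle \sigma_1, \sigma_2 \rangle \in F_1 \times F_2(\mathit{Exp}_G)}
\]
\[
\frac{\vdash \sigma \in F(\mathit{Exp}_G), \quad a \in A}{\vdash \lambda.(a, F \rhd G, \sigma) \in F^A(\mathit{Exp}_G)} \qquad
\frac{\vdash \sigma \in F(\mathit{Exp}_G)}{\vdash \{\sigma\} \in P_\omega F(\mathit{Exp}_G)}
\]
and specified by an operation $\_\in\_(\mathit{Exp}\_) : \mathit{ExpStruct}\ \mathit{Functor}\ \mathit{Functor} \to \mathit{Bool}$ (where we used a mix-fix notation) and an equation for each of the above inference rules. For example, the first rule is associated with the equation $b \in B(\mathit{Exp}_G) = b : B \rhd G$. For consistency of notation, we write $\sigma \in F(\mathit{Exp}_G)$ to denote that $\sigma$ is an element of $F(\mathit{Exp}_G)$.

Remark 4.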

In terms of membership equational logic (MEL) [3], both $F \rhd G$ and $F(\mathit{Exp}_G)$ can be thought of as being sorts and, for example, $\varepsilon : F \rhd G$ as a membership assertion. Even though MEL is an elegant theory, we prefer not to use it here because this would require the dynamic declaration of sorts and a set of assertions for each such sort. The above approach is generic and therefore more flexible.

The equational entailment relation $\vdash_{NDF}$ for bisimilarity checking.

As hinted at the beginning of this section, in order to reason algebraically on bisimilarity of $G$-expressions in CIRC, one has to extend the deduction relation $\vdash$ to allow a restricted contextual reasoning on expressions in $F(\mathit{Exp}_G)$, for all ingredients $F$ of a non-deterministic functor $G$. We call the extended entailment $\vdash_{NDF}$.

The aforementioned restriction refers to inhibiting the use of congruence during equational reasoning, in order to guarantee the soundness of CIRC proofs. This is realized by means of a freezing operator, which intuitively behaves as a wrapper on the expressions checked for equivalence, by changing their sort to a fresh sort $\mathit{Frozen}$. This way, the hypotheses collected during a CIRC proof session cannot be used freely in contextual reasoning, hence preventing the derivation of untrue equations (as illustrated in Example 2).

We further show how the freezing mechanism is implemented in our algebraic setting, and define $\vdash_{NDF}$.

Let $\mathcal{E}$ be an algebraic specification. We extend $\mathcal{E}$ by adding the freezing operation $\boxed{-} : s \to \mathit{Frozen}$ for each sort $s \in \Sigma$, where $\mathit{Frozen}$ is a fresh sort. By $\boxed{t}$ we represent the frozen form of a $\Sigma$-term $t$, and by $\boxed{e}$ a frozen equation of the shape $(\forall X)\ \boxed{t} = \boxed{t'}\ \textit{if}\ c$. The entailment relation $\vdash$ is defined over frozen equations following the line in [17]; more details are provided in Section 5.

Recall from Section 2 that a relation $\mathcal{R} \subseteq \mathit{Exp}_G \times \mathit{Exp}_G$ is a bisimulation if and only if $(s, t) \in \mathcal{R} \Rightarrow (\delta_{G \rhd G}(s), \delta_{G \rhd G}(t)) \in \overline{G}(\mathcal{R})$. Here, $\overline{G}(\mathcal{R}) \subseteq G(\mathit{Exp}_G) \times G(\mathit{Exp}_G)$ is the lifting of the relation $\mathcal{R} \subseteq \mathit{Exp}_G \times \mathit{Exp}_G$, defined as
\[
\overline{G}(\mathcal{R}) = \{ (G(\pi_1)(x), G(\pi_2)(x)) \mid x \in G(\mathcal{R}) \}.
\]
So, intuitively, reasoning on bisimilarity of two expressions $(\varepsilon, \varepsilon')$ in $\mathcal{R}$ reduces to checking whether the application of $\delta_G$ maps them into $\overline{G}(\mathcal{R})$.

Therefore, checking whether a pair $(s^\delta, t^\delta)$ is in $\overline{G}(\mathcal{R})$ consists in checking, for example for the case of $G = G_1 \times G_2$, whether $(s^\delta_1, t^\delta_1) \in \overline{G_1}(\mathcal{R})$ and $(s^\delta_2, t^\delta_2) \in \overline{G_2}(\mathcal{R})$, where $s^\delta = \langle s^\delta_1, s^\delta_2 \rangle$ and $t^\delta = \langle t^\delta_1, t^\delta_2 \rangle$. In an algebraic setting, this would reduce to building an algebraic specification $\mathcal{E}$ and defining an entailment relation $\vdash_{NDF}$ such that one can infer $\mathcal{E} \vdash_{NDF} \boxed{\langle s^\delta_1, s^\delta_2 \rangle} = \boxed{\langle t^\delta_1, t^\delta_2 \rangle}$ (this is the algebraic correspondent we consider for $(\langle s^\delta_1, s^\delta_2 \rangle, \langle t^\delta_1, t^\delta_2 \rangle) \in \overline{G}(\mathcal{R})$) by showing $\mathcal{E} \vdash_{NDF} \boxed{s^\delta_1} = \boxed{t^\delta_1}$ (or $(s^\delta_1, t^\delta_1) \in \overline{G_1}(\mathcal{R})$) and $\mathcal{E} \vdash_{NDF} \boxed{s^\delta_2} = \boxed{t^\delta_2}$ (or $(s^\delta_2, t^\delta_2) \in \overline{G_2}(\mathcal{R})$). We hint that the aforementioned algebraic specification $\mathcal{E}$ consists of $\mathcal{E}_G$ and a set of frozen equations (see Corollary 1).

The entailment relation $\vdash_{NDF}$ for reasoning on bisimilarity of $G$-expressions is based on the definition of $\overline{G}$.

Definition 6.

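The entailment relation $\vdash_{NDF}$ is the extension of $\vdash$ with the following inference rules, which allow a restricted contextual reasoning over the frozen equations of structured expressions:
\[
\frac{\mathcal{E}_G \vdash_{NDF} \boxed{\sigma_1} = \boxed{\sigma_1'} \qquad \mathcal{E}_G \vdash_{NDF} \boxed{\sigma_2} = \boxed{\sigma_2'}}{\mathcal{E}_G \vdash_{NDF} \boxed{\langle \sigma_1, \sigma_2 \rangle} = \boxed{\langle \sigma_1', \sigma_2' \rangle}} \tag{6}
\]
\[
\frac{\mathcal{E}_G \vdash_{NDF} \boxed{\sigma} = \boxed{\sigma'}}{\mathcal{E}_G \vdash_{NDF} \boxed{k_i(\sigma)} = \boxed{k_i(\sigma')}}\ i \in \{1, 2\} \tag{7}
\]
\[
\frac{\mathcal{E}_G \vdash_{NDF} \boxed{f(a)} = \boxed{g(a)}, \text{ for all } a \in A}{\mathcal{E}_G \vdash_{NDF} \boxed{f} = \boxed{g}} \tag{8}
\]
\[
\frac{\mathcal{E}_G \vdash_{NDF} \boxed{\sigma_{i_1}} = \boxed{\sigma'_{j_1}}, \ \dots, \ \mathcal{E}_G \vdash_{NDF} \boxed{\sigma_{i_k}} = \boxed{\sigma'_{j_k}}}{\mathcal{E}_G \vdash_{NDF} \boxed{\{\sigma_1, \dots, \sigma_n\}} = \boxed{\{\sigma'_1, \dots, \sigma'_m\}}}\quad
\begin{array}{l} \{i_1, \dots, i_k\} = \{1, \dots, n\} \\ \{j_1, \dots, j_k\} = \{1, \dots, m\} \end{array} \tag{9}
\]

Remark 5.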

Note that the extension of the entailment relation $\vdash$ to $\vdash_{NDF}$ implies that $\mathcal{E}_G \vdash e$ iff $\mathcal{E}_G \vdash_{NDF} e$ holds, for any equation $e$ of shape $\varepsilon_1 = \varepsilon_2$ or $\boxed{\varepsilon_1} = \boxed{\varepsilon_2}$, with $\varepsilon_1, \varepsilon_2$ non-structured expressions. Below, we will use the notation $\mathcal{E}_G \vdash_{NDF} \mathcal{R}$, where $\mathcal{R}$ is a set of possibly frozen equations, to denote $\forall e \in \mathcal{R} \cdot \mathcal{E}_G \vdash_{NDF} e$.

It is interesting to recall the relation lifting for the powerset functor, which is encoded in the last rule of Definition 6. A pair $(U, V)$ is in $\overline{P_\omega G}(\mathcal{R})$ if and only if for every $u \in U$ there exists a $v \in V$ such that $(u, v)$ belongs to $\overline{G}(\mathcal{R})$ and, conversely, for every $v \in V$ there exists a $u \in U$ such that $(u, v)$ belongs to $\overline{G}(\mathcal{R})$.

Remark 6.

As already hinted (and proved in Corollary 1), reasoning on bisimilarity of expressions in a binary relation $\mathcal{R} \subseteq \mathit{Exp}_G \times \mathit{Exp}_G$ reduces to showing that $\boxed{\delta_G(s)} = \boxed{\delta_G(t)}$ is a $\vdash_{NDF}$-consequence, for all $(s, t) \in \mathcal{R}$. The equational proof is performed in a "top-down" fashion, by reasoning inductively on the subsequent equalities between the components of the corresponding structured expressions $\delta_G(s)$, $\delta_G(t)$. This is realized by applying the inverted rules (6)–(9).

Moreover, note that rule (9) is not invertible in the usual sense; rather, any statement matching the form of the conclusion can only be proved by some instance of the rule.
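The top-down use of the inverted rules (6)–(9) can be sketched in Python. This is only an illustration under stated assumptions, not the CIRC implementation: structured expressions are encoded as hypothetical tagged tuples (`"pair"`, `"inj"`, `"fun"`, `"set"`, `"exp"`, `"const"`), the function name `entails` is invented here, and the hypotheses are a plain set of pairs of expression names that is used only at the leaves (symmetrically, without computing transitivity).

```python
def entails(s, t, hyps):
    """Decide s = t by decomposing both sides along their structure.

    Assumed encoding (hypothetical, for illustration only):
      ("exp", name)       a G-expression, compared via the hypotheses
      ("const", b)        a semilattice element
      ("pair", s1, s2)    <s1, s2>            -- inverted rule (6)
      ("inj", i, s)       k_i(s)              -- inverted rule (7)
      ("fun", {a: s})     a function on A     -- inverted rule (8)
      ("set", [s, ...])   a finite set        -- some instance of rule (9)
    """
    if s[0] != t[0]:
        return False
    tag = s[0]
    if tag == "exp":
        # Leaves: reflexivity, or a hypothesis used as a whole (frozen) equation.
        return s[1] == t[1] or (s[1], t[1]) in hyps or (t[1], s[1]) in hyps
    if tag == "const":
        return s[1] == t[1]
    if tag == "pair":
        return entails(s[1], t[1], hyps) and entails(s[2], t[2], hyps)
    if tag == "inj":
        return s[1] == t[1] and entails(s[2], t[2], hyps)
    if tag == "fun":
        return s[1].keys() == t[1].keys() and all(
            entails(s[1][a], t[1][a], hyps) for a in s[1])
    if tag == "set":
        # Every element on one side must be matched by some element on the other.
        return (all(any(entails(u, v, hyps) for v in t[1]) for u in s[1])
                and all(any(entails(u, v, hyps) for u in s[1]) for v in t[1]))
    raise ValueError("unknown constructor: %r" % (tag,))


hyps = {("e1", "e2")}
left = ("pair", ("const", 0), ("set", [("exp", "e1"), ("exp", "e1")]))
right = ("pair", ("const", 0), ("set", [("exp", "e2")]))
print(entails(left, right, hyps))
```

The `"set"` case mirrors the remark that rule (9) is not invertible in the usual sense: the check searches for some matching between the two sets rather than pairing elements positionally.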

We will further formalize the connection between the inductive definition of $\overline{G}$ (on the coalgebraic side) and $\vdash_{NDF}$ (on the algebraic side) in Theorem 2, hence enabling the definition of bisimulations in algebraic terms, in Corollary 1.

Remark 7.

Equations in $\mathcal{E}_G$ (built as previously described in this section) are used in the equational reasoning only for reducing terms of shape $op(t_1, \dots, t_n)$ according to the definition of the operation $op$. For the simplicity of the proofs of Theorem 2 and Corollary 1, whenever we write $op(t_1, \dots, t_n)$, we refer to the associated term reduced according to the definition of $op$.

First we introduce some notation conventions. Let $G$ be a non-deterministic functor and $\mathcal{R} \subseteq \mathit{Exp}_G \times \mathit{Exp}_G$. We write:
• $\mathcal{R}_{id}$ to denote the set $\mathcal{R} \cup \{ (\varepsilon, \varepsilon) \mid \mathcal{E}_G \vdash \varepsilon : G \rhd G = \mathit{true} \}$;
• $cl(\mathcal{R})$ for the closure of $\mathcal{R}$ under transitivity, symmetry and reflexivity;
• $\boxed{\mathcal{R}}$ to represent the set $\bigcup_{e \in \mathcal{R}} \{ \boxed{e} \}$ (application of the freezing operator to all elements of $\mathcal{R}$);
• $\delta_{G \rhd G}(\varepsilon = \varepsilon')$ to represent the equation $\delta_{G \rhd G}(\varepsilon) = \delta_{G \rhd G}(\varepsilon')$;
• $\mathcal{E}_G \cup \mathcal{R}$ as a shorthand for $(S, \Sigma, E \cup \{ \varepsilon = \varepsilon' \mid (\varepsilon, \varepsilon') \in \mathcal{R} \})$, where $\mathcal{E}_G = (S, \Sigma, E)$;
• $(\sigma, \sigma') \in \overline{G}(\mathcal{R})$ as a shorthand for: $(\sigma, \sigma')$ is among the enumerated elements of a set $S$ explicitly constructed as an enumeration of the finite set $\overline{G}(\mathcal{R})$ (in the algebraic setting, $\overline{G}(\mathcal{R})$ is a subset of $T_{\Sigma, \mathit{ExpStruct}} \times T_{\Sigma, \mathit{ExpStruct}}$ and $\mathcal{E}_G \vdash \overline{G}(\mathcal{R}) = S$).

Theorem 2.

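Consider a non-deterministic functor $G$. Let $F$ be an ingredient of $G$, $\mathcal{R}$ a binary relation on the set of $G$-expressions, and $\sigma, \sigma' \in F(\mathit{Exp}_G)$.
a) If $G$ is not a constant functor, then $(\sigma, \sigma') \in \overline{F}(cl(\mathcal{R}_{id}))$ iff $\mathcal{E}_G \cup \boxed{\mathcal{R}} \vdash_{NDF} \boxed{\sigma} = \boxed{\sigma'}$;
b) If $G$ is a constant functor $B$, then $(\sigma, \sigma') \in \overline{B}(cl(\mathcal{R}_{id}))$ iff $\mathcal{E}_G \vdash_{NDF} \boxed{\sigma} = \boxed{\sigma'}$.

In order to prove Theorem 2.a) we introduce the following lemma:

Lemma 1.
Consider $G$ a non-deterministic functor and $\mathcal{R}$ a binary relation on the set of $G$-expressions. If $(\varepsilon, \varepsilon') \in cl(\mathcal{R}_{id})$ then $\mathcal{E}_G \cup \boxed{\mathcal{R}} \vdash_{NDF} \boxed{\varepsilon} = \boxed{\varepsilon'}$.

Proof.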

The proof is trivial, as equality is reflexive, symmetric and transitive. □
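As a concrete reading of the closure $cl(\mathcal{R}_{id})$ used in Lemma 1, membership can be tested with a union-find structure. The sketch below is illustrative only: the names `make_find` and `in_closure` are hypothetical, expressions are represented by plain strings, and `universe` is assumed to list every well-typed expression under consideration (including all those occurring in the pairs).

```python
def make_find(pairs, universe):
    """Build a union-find 'find' function for the equivalence generated by pairs."""
    parent = {x: x for x in universe}

    def find(x):
        # Path halving keeps the trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)   # union: covers symmetry and transitivity
    return find


def in_closure(pairs, universe, a, b):
    """True iff (a, b) is in the reflexive-symmetric-transitive closure of pairs."""
    find = make_find(pairs, universe)
    return find(a) == find(b)


R = {("e1", "e2"), ("e3", "e2")}
U = ["e1", "e2", "e3", "e4"]
print(in_closure(R, U, "e1", "e3"))  # related through e2
print(in_closure(R, U, "e4", "e4"))  # reflexivity
print(in_closure(R, U, "e1", "e4"))  # unrelated
```

Reflexivity comes for free because every element of `universe` is its own representative initially, which corresponds to adding the identity pairs of $\mathcal{R}_{id}$.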

We are now ready to prove Theorem 2.

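Proof (Theorem 2).
• Proof of Theorem 2.a).

"$\Rightarrow$". The proof is by induction on the structure of $F$.

Base case:
∗ $F = B$. It follows that $(\sigma, \sigma')$ is of shape $(b, b)$ where $b \in B$, therefore $\mathcal{E}_G \cup \boxed{\mathcal{R}} \vdash_{NDF} \boxed{b} = \boxed{b}$ holds by reflexivity.
∗ $F = Id$. In this case $(\sigma, \sigma') \in cl(\mathcal{R}_{id}) = \overline{Id}(cl(\mathcal{R}_{id}))$, so the result follows immediately by Lemma 1.

Induction step:
∗ $F = F_1 \times F_2$. Here $\sigma = \langle \sigma_1, \sigma_2 \rangle$ and $\sigma' = \langle \sigma_1', \sigma_2' \rangle$, where $(\sigma_1, \sigma_1') \in \overline{F_1}(cl(\mathcal{R}_{id}))$ and $(\sigma_2, \sigma_2') \in \overline{F_2}(cl(\mathcal{R}_{id}))$. Therefore, by the induction hypothesis, both $\mathcal{E}_G \cup \boxed{\mathcal{R}} \vdash_{NDF} \boxed{\sigma_1} = \boxed{\sigma_1'}$ and $\mathcal{E}_G \cup \boxed{\mathcal{R}} \vdash_{NDF} \boxed{\sigma_2} = \boxed{\sigma_2'}$ hold. Hence, according to the definition of $\vdash_{NDF}$ (see (6)), we conclude that $\mathcal{E}_G \cup \boxed{\mathcal{R}} \vdash_{NDF} \boxed{\langle \sigma_1, \sigma_2 \rangle} = \boxed{\langle \sigma_1', \sigma_2' \rangle}$ holds.
∗ The cases $F = F_1 \mathbin{\diamondsuit\!\!\!\!+} F_2$, $F = F_1^A$ and $F = P_\omega F'$ are handled in a similar way.

"$\Leftarrow$". We proceed also by induction on the structure of $F$. Moreover, recall that the observations in Remark 7 hold (for each of the subsequent cases).

Base case:
∗ $F = B$. In this case $(\sigma, \sigma')$ is of shape $(b, b')$, where $b, b'$ are two elements of the semilattice $B$. Also, recall that $G \neq B$; therefore the equations in $\mathcal{R}$, whose type $G \rhd G$ differs from $F(\mathit{Exp}_G)$, are not involved in the equational reasoning. We deduce that $b = b'$ is proved by reflexivity, hence $(b, b') = (b, b) \in \overline{B}(cl(\mathcal{R}_{id}))$.
∗ $F = Id$. Note that for this case, $\sigma, \sigma'$ are expressions of the same type as the expressions in $\mathcal{R}$. We further identify two possibilities:
  · $\sigma = \sigma'$ is proved by reflexivity, therefore $(\sigma, \sigma') \in \{ (\varepsilon, \varepsilon) \mid \varepsilon : G \rhd G \} \subseteq \mathcal{R}_{id} \subseteq cl(\mathcal{R}_{id}) = \overline{Id}(cl(\mathcal{R}_{id}))$.
  · the equations in $\mathcal{R}$ are used in the equational reasoning $\mathcal{E}_G \cup \boxed{\mathcal{R}} \vdash_{NDF} \boxed{\sigma} = \boxed{\sigma'}$. In addition, the freezing operator inhibits contextual reasoning, therefore $\sigma = \sigma'$ is proved according to the equations in $\mathcal{R}$, based on the symmetry and transitivity of $\vdash_{NDF}$. In other words, $(\sigma, \sigma') \in cl(\mathcal{R}_{id}) = \overline{Id}(cl(\mathcal{R}_{id}))$.

Induction step:
∗ $F = F_1 \times F_2$. Due to their type, the equations in $\mathcal{R}$ are not involved in the equational reasoning. Also, recall that (*) holds. Therefore, $\mathcal{E}_G \cup \boxed{\mathcal{R}} \vdash_{NDF} \boxed{\langle \sigma_1, \sigma_2 \rangle} = \boxed{\langle \sigma_1', \sigma_2' \rangle}$ is a consequence of the inverted rule (6). More explicitly, it follows that $\mathcal{E}_G \cup \boxed{\mathcal{R}} \vdash_{NDF} \boxed{\sigma_1} = \boxed{\sigma_1'}$ and $\mathcal{E}_G \cup \boxed{\mathcal{R}} \vdash_{NDF} \boxed{\sigma_2} = \boxed{\sigma_2'}$ must hold. By the induction hypothesis, we deduce that $(\sigma_1, \sigma_1') \in \overline{F_1}(cl(\mathcal{R}_{id}))$ and $(\sigma_2, \sigma_2') \in \overline{F_2}(cl(\mathcal{R}_{id}))$. So by the definition of $\overline{F_1 \times F_2}$ we conclude that $(\langle \sigma_1, \sigma_2 \rangle, \langle \sigma_1', \sigma_2' \rangle) = (\sigma, \sigma') \in \overline{F_1 \times F_2}(cl(\mathcal{R}_{id}))$.
∗ The cases $F = F_1 \mathbin{\diamondsuit\!\!\!\!+} F_2$, $F = (F_1)^A$ and $F = P_\omega F'$ follow a similar reasoning.

• Proof of Theorem 2.b). It follows immediately by the definition of $\overline{B}$ and Remark 7. □

Corollary 1.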

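Let $G$ be a non-deterministic functor and $\mathcal{R}$ a binary relation on the set of $G$-expressions.
a) If $G$ is not a constant functor, then $cl(\mathcal{R}_{id})$ is a bisimulation iff $\mathcal{E}_G \cup \boxed{\mathcal{R}} \vdash_{NDF} \boxed{\delta_{G \rhd G}(\mathcal{R})}$;
b) If $G$ is a constant functor $B$, then $cl(\mathcal{R}_{id})$ is a bisimulation iff $\mathcal{E}_G \vdash_{NDF} \boxed{\delta_{G \rhd G}(\mathcal{R})}$.

Proof.
• Proof of Corollary 1.a). We reason as follows:
\[
\begin{array}{cll}
 & cl(\mathcal{R}_{id}) \text{ is a bisimulation} & \\
\Leftrightarrow & (\forall (\varepsilon, \varepsilon') \in cl(\mathcal{R}_{id})).\ (\delta_{G \rhd G}(\varepsilon), \delta_{G \rhd G}(\varepsilon')) \in \overline{G}(cl(\mathcal{R}_{id})) & \text{(Def. 1)} \\
\Leftrightarrow & \mathcal{E}_G \cup \boxed{\mathcal{R}} \vdash_{NDF} \boxed{\delta_{G \rhd G}(cl(\mathcal{R}_{id}))} & \text{(Thm. 2)} \\
\Leftrightarrow & \mathcal{E}_G \cup \boxed{\mathcal{R}} \vdash_{NDF} \boxed{\delta_{G \rhd G}(\mathcal{R})} & (cl(\mathcal{R}_{id}),\ \vdash_{NDF})
\end{array}
\]
• Proof of Corollary 1.b). It follows immediately by the definition of bisimulation relations and according to the observations in Remark 7. □

In Figure 2 we briefly summarize the results of the current section, namely, the algebraic encoding of the coalgebraic setting presented in [23].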

5. A Decision Procedure for Bisimilarity in CIRC

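In this section, we describe how the coinductive theorem prover CIRC [14] can be used to implement the decision procedure for the bisimilarity of generalized regular expressions, which we discussed above.

coalgebraic  |  algebraic
$\vdash \varepsilon : F \rhd G$  |  $\mathcal{E}_G \vdash \varepsilon : F \rhd G = \mathit{true}$
$\mathit{Exp}_{F \rhd G}$  |  $\{ \varepsilon \in T_{\Sigma, \mathit{Exp}} \mid \mathcal{E}_G \vdash \varepsilon : F \rhd G = \mathit{true} \}$
$\mathit{Exp}_G$  |  $\{ \varepsilon \in T_{\Sigma, \mathit{Exp}} \mid \mathcal{E}_G \vdash \varepsilon : G \rhd G = \mathit{true} \}$
$F(\mathit{Exp}_G)$  |  $\{ \sigma \in T_{\Sigma, \mathit{ExpStruct}} \mid \mathcal{E}_G \vdash \sigma \in F(\mathit{Exp}_G) = \mathit{true} \}$
$\delta_{F \rhd G} : \mathit{Exp}_{F \rhd G} \to F(\mathit{Exp}_G)$  |  $\delta\_(\_) : \mathit{Ingredient}\ \mathit{Exp} \to \mathit{ExpStruct}$
(for $\sigma, \sigma'$ with $\mathcal{E}_G \vdash \sigma \in F(\mathit{Exp}_G) = \mathit{true}$ and $\mathcal{E}_G \vdash \sigma' \in F(\mathit{Exp}_G) = \mathit{true}$)
$(\sigma, \sigma') \in \overline{F}(cl(\mathcal{R}_{id}))$  |  $\mathcal{E}_G \cup \boxed{\mathcal{R}} \vdash_{NDF} \boxed{\sigma} = \boxed{\sigma'}$ if $G \neq B$, or $\mathcal{E}_G \vdash_{NDF} \boxed{\sigma} = \boxed{\sigma'}$ if $G = B$ (Thm. 2)
$cl(\mathcal{R}_{id})$ is a bisimulation  |  $\mathcal{E}_G \cup \boxed{\mathcal{R}} \vdash_{NDF} \boxed{\delta_{G \rhd G}(\mathcal{R})}$ if $G \neq B$, or $\mathcal{E}_G \vdash_{NDF} \boxed{\delta_{G \rhd G}(\mathcal{R})}$ if $G = B$ (Cor. 1)

Figure 2: Non-deterministic functors: coalgebraic vs. algebraic approach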

CIRC can be seen as an extension of Maude with behavioral features, and its implementation is derived from that of Full-Maude. In order to use the prover, one needs to provide a specification (a CIRC theory) and a set of goals. A CIRC theory $\mathcal{B} = (S, (\Sigma, \Delta), (E, I))$ consists of an algebraic specification $(S, \Sigma, E)$, a set $\Delta$ of derivatives, and a set $I$ of equational interpolants, which are expressions of the form $e \Rightarrow \{ e_i \mid i \in I \}$ where $e$ and $e_i$ are equations. The intuition for this type of expression is simple: $e$ holds whenever the equation $e_i$ holds for every $i$ in $I$. In other words, to prove $E \vdash e$ one can choose to prove $E \vdash \{ e_i \mid i \in I \}$ instead. For the particular case of non-deterministic functors, we use equational interpolants to extend the initial entailment relation in a way consistent with rules (6)–(9). (For more information on equational interpolants see [7].)

A derivative $\delta \in \Delta$ is a $\Sigma$-term containing a special variable $* : s$ (i.e., a $\Sigma$-context), where $s$ is the sort of the variable $*$. If $e$ is an equation $t = t'$ with $t$ and $t'$ of sort $s$, then $\delta[e]$ is $\delta[t/{*:s}] = \delta[t'/{*:s}]$. We call this type of equation a derivable equation. The other equations are non-derivable. We write $\delta[\mathcal{R}]$ to represent $\{ \delta[e] \mid e \in \mathcal{R} \}$, where $\mathcal{R}$ is a set of derivable equations, and $\Delta[e]$ for the set $\{ \delta[e] \mid \delta \in \Delta \text{ appropriate for } e \}$.

Moreover, note that CIRC works with an extension of the entailment relation $\vdash$ over frozen equations (introduced in Section 4), with two more axioms, as in [17]:
\[
E \cup \mathcal{R} \vdash e \ \text{ iff } \ E \vdash e \tag{10}
\]
\[
E \cup \mathcal{R} \vdash \mathcal{G} \ \text{ implies } \ E \cup \delta[\mathcal{R}] \vdash \delta[\mathcal{G}] \ \text{ for each } \delta \in \Delta \tag{11}
\]
Above, $E$ ranges over unfrozen equations, $e$ over non-derivable unfrozen equations, and $\mathcal{R}$, $\mathcal{G}$ over derivable frozen equations.

Remark 8.

Note that the new entailment $\vdash_{NDF}$ extended over frozen equations (in Definition 6) satisfies the assumptions (10) and (11).

CIRC implements the coinductive proof system given in [17] using a set of reduction rules of the form $(\mathcal{B}, \mathcal{F}, \mathcal{G}) \Rightarrow (\mathcal{B}, \mathcal{F}', \mathcal{G}')$, where $\mathcal{B}$ represents a specification, $\mathcal{F}$ is the coinductive hypothesis (a set of frozen equations) and $\mathcal{G}$ is the current set of goals. The freezing operator is defined as described in Section 4. Here is a brief description of these rules:

[Done]: $(\mathcal{B}, \mathcal{F}, \{\}) \Rightarrow \cdot$
Whenever the set of goals is empty, the system terminates with success.

[Reduce]: $(\mathcal{B}, \mathcal{F}, \mathcal{G} \cup \{ \boxed{e} \}) \Rightarrow (\mathcal{B}, \mathcal{F}, \mathcal{G})$ if $\mathcal{B} \cup \mathcal{F} \vdash \boxed{e}$
If the current goal is a $\vdash$-consequence of $\mathcal{B} \cup \mathcal{F}$ then $e$ is removed from the set of goals.

[Derive]: $(\mathcal{B}, \mathcal{F}, \mathcal{G} \cup \{ \boxed{e} \}) \Rightarrow (\mathcal{B}, \mathcal{F} \cup \{ \boxed{e} \}, \mathcal{G} \cup \boxed{\Delta[e]})$ if $\mathcal{B} \cup \mathcal{F} \nvdash \boxed{e}$
When the current goal $e$ is derivable and it is not a $\vdash$-consequence, it is added to the hypotheses and its derivatives are added to the set of goals.

[Simplify]: $(\mathcal{B}, \mathcal{F}, \mathcal{G} \cup \{ \boxed{\theta(e)} \}) \Rightarrow (\mathcal{B}, \mathcal{F}, \mathcal{G} \cup \{ \boxed{\theta(e_i)} \mid i \in I \})$ if $e \Rightarrow \{ e_i \mid i \in I \}$ is an equational interpolant from the specification and $\theta : X \to T_\Sigma(Y)$ is a substitution.

[Fail]: $(\mathcal{B}, \mathcal{F}, \mathcal{G} \cup \{ \boxed{e} \}) \Rightarrow \mathit{failure}$ if $\mathcal{B} \cup \mathcal{F} \nvdash \boxed{e}$ $\wedge$ $e$ is non-derivable
This rule stops the reduction process with failure whenever the current goal $e$ is non-derivable and is not a $\vdash$-consequence of $\mathcal{B} \cup \mathcal{F}$.

It is worth noting that there is a strong connection between a CIRC proof and the construction of a bisimulation relation. We illustrate this fact and the importance of the freezing operator with a simple example.

Example 2.

Consider the case of infinite streams. The set $B^\omega$ of infinite streams over a set $B$ is the final coalgebra of the functor $S = B \times Id$, with a coalgebra structure given by $hd$ and $tl$, the functions that return the head and the tail of the stream, respectively. Our purpose is to prove that $0^\infty = (00)^\infty$. Let $z$ and $zz$ represent the stream on the left hand side and, respectively, on the right hand side. These streams are defined by the equations: $hd(z) = 0$, $tl(z) = z$, $hd(zz) = 0$, $tl(zz) = 0{:}zz$. Note that equations over $B$ like $hd(z) = 0$ are not derivable and equations over streams like $tl(z) = z$ are derivable.

In Fig. 3 we present the correlation between the CIRC proof and the construction of the bisimulation relation. Note how CIRC collects the elements of the bisimulation as frozen hypotheses.

CIRC proof  |  Bisimulation construction (states $z$, $zz$, $(zz)'$)
(add goal $z = zz$.)  $(\mathcal{B}, \{\}, \{ \boxed{z} = \boxed{zz} \})$  |  $F = \{\}$; $z \sim zz$ ?
[Derive] $\longrightarrow$ $(\mathcal{B}, \{ \boxed{z} = \boxed{zz} \}, \{ \boxed{hd(z)} = \boxed{hd(zz)},\ \boxed{tl(z)} = \boxed{tl(zz)} \})$  |  $F = \{ (z, zz) \}$; $z \longrightarrow z$, $zz \longrightarrow (zz)'$
[Reduce] $\longrightarrow$ $(\mathcal{B}, \{ \boxed{z} = \boxed{zz} \}, \{ \boxed{z} = \boxed{0{:}zz} \})$  |  $F = \{ (z, zz) \}$; $z \sim (zz)'$ ?
[Derive] $\longrightarrow$ $(\mathcal{B}, \{ \boxed{z} = \boxed{zz},\ \boxed{z} = \boxed{0{:}zz} \}, \{ \boxed{hd(z)} = \boxed{hd(0{:}zz)},\ \boxed{tl(z)} = \boxed{tl(0{:}zz)} \})$  |  $F = \{ (z, zz), (z, (zz)') \}$; $z \longrightarrow z$, $(zz)' \longrightarrow zz$
[Reduce] $\longrightarrow$ $(\mathcal{B}, \{ \boxed{z} = \boxed{zz},\ \boxed{z} = \boxed{0{:}zz} \}, \{\})$  |  $F = \{ (z, zz), (z, (zz)') \}$ ✓

Figure 3: Parallel between a CIRC proof and the bisimulation construction

Let us analyze what would happen if the freezing operator $\boxed{-}$ were not used. Suppose the circular coinduction algorithm added the equation $z = zz$ in its unfrozen form to the hypotheses. After applying the derivatives we obtain the goals $hd(z) = hd(zz)$, $tl(z) = tl(zz)$. At this point, the prover could use the freshly added equation $z = zz$, and according to the congruence rule, both goals would be proven directly, though we would still be in the process of showing that the hypothesis holds. By following a similar reasoning, we could also prove that $0^\infty = 1^\infty$! In order to avoid these situations, the hypotheses are frozen (i.e., their sort is changed from $\mathit{Stream}$ to $\mathit{Frozen}$), and this stops the application of the congruence rule, forcing the application of the derivatives according to their definition in the specification. Therefore, the use of the freezing operator is vital for the soundness of circular coinduction.
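The stream proof of Example 2 can be replayed with a small Python sketch of the [Reduce]/[Derive]/[Fail]/[Done] loop. It is a simplified illustration under stated assumptions, not CIRC itself: streams are given as a hypothetical table `system` mapping a state name to its head and the name of its tail, the function name `prove_streams` is invented here, and a hypothesis is only ever matched against a whole goal, which is the effect the freezing operator achieves.

```python
def prove_streams(system, goal):
    """Return the collected hypotheses if the goal is proved, or None on failure.

    system: dict mapping a state name to (head, tail_state)   [assumed encoding]
    goal:   pair of state names (s, t) standing for the equation s = t
    """
    hyps = set()      # F: frozen hypotheses, used only on whole goals
    goals = [goal]    # G: current proof obligations
    while goals:
        s, t = goals.pop()
        if s == t or (s, t) in hyps:
            continue                    # [Reduce]: goal follows from B and F
        head_s, tail_s = system[s]
        head_t, tail_t = system[t]
        if head_s != head_t:
            return None                 # [Fail]: the non-derivable hd goal is false
        hyps.add((s, t))                # [Derive]: remember the goal as hypothesis
        goals.append((tail_s, tail_t))  # and continue with its tl derivative
    return hyps                         # [Done]: no goals left


system = {
    "z": (0, "z"),          # hd(z) = 0,  tl(z) = z
    "zz": (0, "0:zz"),      # hd(zz) = 0, tl(zz) = 0:zz
    "0:zz": (0, "zz"),
    "ones": (1, "ones"),    # extra stream 1^infinity, added for the negative case
}
print(prove_streams(system, ("z", "zz")))
print(prove_streams(system, ("z", "ones")))
```

For the goal `("z", "zz")` the returned hypotheses are the two pairs collected in Figure 3. The hypothesis `("z", "zz")` is never used to rewrite inside `hd(...)` or `tl(...)`; allowing that would correspond to the unsound, unfrozen variant discussed above.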

Next, we focus on using CIRC for automatically reasoning on the equivalence of $G$-expressions. As we will show, the implementation of both the algebraic specifications associated to non-deterministic functors and the equational entailment relation described in Section 4 is immediate. Given a non-deterministic functor $G$, we define a CIRC theory $\mathcal{B}_G = (S, (\Sigma, \Delta), (E, I))$ as follows:
• $(S, \Sigma, E)$ is $\mathcal{E}_G$;
• $\Delta = \{ \delta_{G \rhd G}(* : \mathit{Exp}) \}$, so the only derivable equations are those of sort $\mathit{Exp}$. As we have already seen for the example of streams, equations of sort $\mathit{Slt}$ must not be derivable. Since we have the subsort relation $\mathit{Slt} < \mathit{Exp}$, we avoid the application of the derivative $\delta_{G \rhd G}(* : \mathit{Exp})$ over equations of sort $\mathit{Slt}$ by means of an interpolant (see below);
• $I$ consists of the following equational interpolants, whose role is to replace current proof obligations over non-trivial structures with simpler ones:
\[
\langle \sigma_1, \sigma_2 \rangle = \langle \sigma_1', \sigma_2' \rangle \ \Rightarrow\ \{ \sigma_1 = \sigma_1',\ \sigma_2 = \sigma_2' \} \tag{12}
\]
\[
k_i(\sigma) = k_i(\sigma') \ \Rightarrow\ \{ \sigma = \sigma' \} \tag{13}
\]
\[
f = g \ \Rightarrow\ \{ f(a) = g(a) \mid a \in A \} \tag{14}
\]
\[
\bigcup_{i \in \{1..n\}} \{ \sigma_i \} = \bigcup_{j \in \{1..m\}} \{ \sigma_j' \} \ \Rightarrow\ \Big\{ \bigwedge_{i \in \{1..n\}} \Big( \bigvee_{j \in \{1..m\}} \sigma_i = \sigma_j' \Big) \wedge \bigwedge_{j \in \{1..m\}} \Big( \bigvee_{i \in \{1..n\}} \sigma_i = \sigma_j' \Big) \Big\} \tag{15}
\]
together with an equational interpolant
\[
t = t' \ \Rightarrow\ \{ t \simeq t' = \mathit{true} \} \tag{16}
\]
where $\simeq$ is the equality predicate equationally defined over the sort $\mathit{Slt}$. The last interpolant transforms the equations of sort $\mathit{Slt}$ from derivable (because of the subsort relation $\mathit{Slt} < \mathit{Exp}$) into non-derivable and equivalent ones.

The interpolants (12)–(16) in $I$ extend the entailment relation $\vdash_{NDF}$ (introduced in Definition 6) as follows:
\[
\frac{E \vdash_{NDF} \{ e_i \mid i \in I \}}{E \vdash_{NDF} e} \quad \text{if } e \Rightarrow \{ e_i \mid i \in I \} \text{ in } I
\]

Theorem 3 (Soundness).

Let $G$ be a non-deterministic functor, and $\mathcal{G}$ a binary relation on the set of $G$-expressions. If $(\mathcal{B}_G, \mathcal{F}_0 = \{\}, \mathcal{G}_0 = \boxed{\mathcal{G}}) \stackrel{*}{\Rightarrow} (\mathcal{B}_G, \mathcal{F}_n, \mathcal{G}_n = \{\})$ using [Reduce], [Derive] and [Simplify], then $\mathcal{G} \subseteq\ \sim_G$.

Proof.
The idea of the proof is to find a bisimulation relation $\widetilde{\mathcal{F}}$ such that $\mathcal{G} \subseteq \widetilde{\mathcal{F}}$. First, let $\mathcal{F}$ represent the set of hypotheses (or derived goals) collected during the proof session. We distinguish between two cases:

a) $G = B$. For this case, the set of expressions in $\mathcal{G}$ is given by the following grammar:
\[
\varepsilon ::= \emptyset \mid b \mid \varepsilon \oplus \varepsilon \mid \mu x.\varepsilon. \tag{17}
\]
Note that the goals $\varepsilon = \varepsilon'$ in $\mathcal{G}$ are proven
1. either according to [Simplify], applied in the context of the equational interpolant (16). If this is the case, then $\varepsilon = \varepsilon'$ holds by reflexivity, therefore
\[
\mathcal{B}_G \vdash_{NDF} \delta_{B \rhd B}(\varepsilon) = \delta_{B \rhd B}(\varepsilon') \tag{18}
\]
also holds;
2. or after the application of [Derive], in which case $\mathcal{B}_G \cup \mathcal{F} \vdash_{NDF} \delta_{B \rhd B}(\varepsilon) = \delta_{B \rhd B}(\varepsilon')$ holds. Moreover, note that $\delta_{B \rhd B}(\varepsilon)$ and $\delta_{B \rhd B}(\varepsilon')$ are reduced to $b$ and $b' \in B$, respectively, according to (17) and the definition of $\delta_{B \rhd B}$. Consequently, the non-derivable (due to the subsort relation $B < \mathit{Slt}$) goal $b = b'$ holds by reflexivity, so the following is a sound statement:
\[
\mathcal{B}_G \vdash_{NDF} \delta_{B \rhd B}(\varepsilon) = \delta_{B \rhd B}(\varepsilon'). \tag{19}
\]
Based on (18), (19) and Corollary 1.b), we conclude that $\widetilde{\mathcal{F}} = cl(\mathcal{G}_{id})$ is a bisimulation, hence $\mathcal{G} \subseteq cl(\mathcal{G}_{id}) \subseteq\ \sim_G$.

b) $G \neq B$. Based on the reduction rules implemented in CIRC, it is easy to see that the initial set of goals $\mathcal{G}$ is a $\vdash_{NDF}$-consequence of $\mathcal{B}_G \cup \mathcal{F}$. In other words, $\mathcal{G} \subseteq cl(\mathcal{F}_{id})$. It therefore remains to show that $\widetilde{\mathcal{F}} = cl(\mathcal{F}_{id})$ is a bisimulation, i.e., according to Corollary 1, that $\mathcal{B}_G \cup \mathcal{F} \vdash_{NDF} \delta_{G \rhd G}(\mathcal{F})$. This is achieved by proving that $\mathcal{B}_G \cup \mathcal{F} \vdash_{NDF} \mathcal{G}_i$ for $i \in \{0, \dots, n\}$ (note that $\delta_{G \rhd G}(\mathcal{F}) \subseteq \bigcup_{i \in \{0, \dots, n\}} \mathcal{G}_i$, according to [Derive]). The proof is by induction on $j$, where $n - j$ is the current proof step, and by case analysis on the CIRC reduction rules applied at each step.

We further provide a sketch of the proof. The base case $j = n$ follows immediately, as $\mathcal{B}_G \cup \mathcal{F} \vdash_{NDF} \mathcal{G}_n = \emptyset$. For the induction step we proceed as follows. Let $e \in \mathcal{G}_j$. If $e \in \mathcal{G}_{j+1}$ then $\mathcal{B}_G \cup \mathcal{F} \vdash_{NDF} e$ by the induction hypothesis. If $e \notin \mathcal{G}_{j+1}$ then, for example, if [Reduce] was applied then it holds that $\mathcal{B}_G \cup \mathcal{F}_j \vdash_{NDF} e$. Recall that $\mathcal{F}_j \subseteq \mathcal{F}$, so $\mathcal{B}_G \cup \mathcal{F} \vdash_{NDF} e$ also holds. The result follows in a similar fashion for the application of [Derive] or [Simplify]. □

Remark 9.

The soundness of the proof system we describe in this paper does not follow directly from Theorem 3 in [17]. This is due to the fact that we do not have an experiment-based definition of bisimilarity. So, even though the mechanism we use for proving $\mathcal{B}_G \cup \mathcal{F} \vdash_{NDF} \delta_{G \rhd G}(\mathcal{F})$ (for the case $G \neq B$) is similar to the one described in [17], the current soundness proof is conceived in terms of bisimulations (and not experiments).

Remark 10.

The entailment relation $\vdash_{NDF}$ that CIRC uses for checking the equivalence of generalized regular expressions is an instantiation of the parametric entailment relation $\vdash$ from the proof system in [17]. This approach allows CIRC to reason automatically on a large class of systems which can be modeled as non-deterministic coalgebras.

As already stated, our final goal is to use CIRC as a decision procedure for the bisimilarity of generalized regular expressions. That is, whenever provided with a set of expressions, the prover stops with a yes/no answer w.r.t. their equivalence. In this context, an important aspect is that the sub-coalgebra generated by an expression $\varepsilon \in \mathit{Exp}_G$ by repeatedly applying $\delta_G$ is, in general, infinite. Take for example the non-deterministic functor $S = B \times Id$ associated to infinite streams, and consider the property $\mu x.\emptyset \oplus r\langle x \rangle = \mu x.r\langle x \rangle$. In order to prove this, CIRC builds an infinite proof sequence by repeatedly applying $\delta_S$ as follows:
\[
\begin{array}{c}
\delta_S(\mu x.\emptyset \oplus r\langle x \rangle) = \delta_S(\mu x.r\langle x \rangle) \\
\downarrow \\
\langle 0, \emptyset \oplus (\mu x.\emptyset \oplus r\langle x \rangle) \rangle = \langle 0, \mu x.r\langle x \rangle \rangle \\[4pt]
\delta_S(\emptyset \oplus (\mu x.\emptyset \oplus r\langle x \rangle)) = \delta_S(\mu x.r\langle x \rangle) \\
\downarrow \\
\langle 0, \emptyset \oplus \emptyset \oplus (\mu x.\emptyset \oplus r\langle x \rangle) \rangle = \langle 0, \mu x.r\langle x \rangle \rangle \\
{[\ldots]}
\end{array}
\]
In this case, the prover would never stop. We observed in Section 3 that Theorem 1 guarantees we can associate a finite coalgebra to a certain expression. In the proof of the aforementioned theorem, which is presented in [23], it is shown that the axioms for associativity, commutativity and idempotence (ACI) of $\oplus$ guarantee finiteness of the generated sub-coalgebra (note that these axioms have also been proven sound w.r.t. bisimulation). ACI properties can easily be specified in CIRC, as the prover is an extension of Maude, which has a powerful matching modulo ACUI (ACI plus unity) capability. Idempotence is given by the equation $\varepsilon \oplus \varepsilon = \varepsilon$, and commutativity and associativity are specified as attributes of $\oplus$. It is interesting to remark that for the powerset functor termination is guaranteed without the axioms, because the coalgebra structure on the expressions for the powerset functor already includes ACI (since $P_\omega(\mathit{Exp})$ is itself a join-semilattice).
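The effect of the ACI axioms on the example above can be illustrated with a small Python sketch. It is an assumption-laden illustration and not how Maude implements matching modulo ACI: sums are encoded as hypothetical tuples `("plus", e1, e2, ...)`, all other expressions are treated as opaque atoms (strings here), and the function name `normalize` is invented for this sketch.

```python
def normalize(e):
    """Normal form of e modulo associativity, commutativity and idempotence of (+).

    A sum is flattened into the frozenset of its non-sum summands; a sum that
    collapses to a single summand (e (+) e = e) is replaced by that summand.
    Anything that is not a ("plus", ...) tuple is returned unchanged.
    """
    if isinstance(e, tuple) and e and e[0] == "plus":
        summands = set()
        for arg in e[1:]:
            n = normalize(arg)
            if isinstance(n, frozenset):
                summands |= n        # associativity: merge nested sums
            else:
                summands.add(n)      # idempotence: duplicates disappear
        if len(summands) == 1:
            return next(iter(summands))
        return frozenset(summands)   # commutativity: order is irrelevant
    return e


mu = "mu x.(empty + r<x>)"                        # opaque name for the fixed point
first = ("plus", "empty", mu)                      # empty (+) mu...
second = ("plus", "empty", ("plus", "empty", mu))  # empty (+) empty (+) mu...
print(normalize(first) == normalize(second))
```

Under this normal form the first and second derivatives of the example coincide, so a prover working modulo ACI would see a goal it has already recorded instead of an ever-growing one.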

Theorem 4.

Let $\mathcal{G}$ be a set of proof obligations over generalized regular expressions. CIRC can be used as a decision procedure for the equivalences in $\mathcal{G}$, that is, it can decide whether a goal $(\varepsilon_1, \varepsilon_2) \in \mathcal{G}$ is a true or false equality.

Proof.
Note that, as proven in [23], the ACI axioms for $\oplus$ guarantee that $\delta_G$ is applied a finite number of times in the generation of the sub-coalgebra associated to a $G$-expression. Therefore, it follows that by implementing the ACI axioms in CIRC (as attributes of $\oplus$), the set of new goals obtained by applying $\delta_G$ is finite. In these circumstances, whenever CIRC stops according to the reduction rule [Done], the expressions in the initial proof obligations are bisimilar. On the other hand, whenever it terminates with [Fail], they are not bisimilar. □

6. A CIRC-based Tool

We have implemented a tool that, when provided with a functor $G$, automatically generates a specification for CIRC which can then be used in order to automatically check whether two $G$-expressions are bisimilar. The tool is implemented as a metalanguage application in Maude. It can be downloaded from the address http://goriac.info/tools/functorizer/ . In order to start the tool, one needs to launch Maude along with the extension Full-Maude and load the downloaded file using the command in functorizer.maude .

The general use case consists in providing the join-semilattices, the alphabets and the expressions. After these steps, the tool automatically checks if the provided expressions are guarded, closed and correctly typed. If this check succeeds, then it outputs a specification that can be further processed by CIRC. In the end, the prover outputs either the bisimulation, if the expressions are equivalent, or a negative answer, otherwise.

We present two case studies in order to emphasize the high degree of generality for the types of systems we can handle, and show how the tool is used.

Example 3.

We consider the case of Mealy machines, which are coalgebras for the functor $(B \times Id)^A$.

Formally, a Mealy machine is a pair $(S, \alpha)$ consisting of a set $S$ of states and a transition function $\alpha : S \to (B \times S)^A$, which for each state $s \in S$ and input $a \in A$ associates an output value $b$ and a next state $s'$. Typically, we write $\alpha(s)(a) = (b, s') \Leftrightarrow s \xrightarrow{a|b} s'$.

In this example and in what follows we will consider for the output the two-value join-semilattice $B = \{0, 1\}$ (with $\bot_B = 0$) and for the input alphabet $A = \{a, b\}$. The expressions for Mealy machines are given by the grammar:
\[
\begin{array}{lcl}
E & ::= & \emptyset \mid x \mid E \oplus E \mid \mu x.E_2 \mid a(r\langle E \rangle) \mid b(r\langle E \rangle) \mid a(l\langle E_1 \rangle) \mid b(l\langle E_1 \rangle) \\
E_1 & ::= & \emptyset \mid E_1 \oplus E_1 \mid 0 \mid 1 \\
E_2 & ::= & \emptyset \mid E_2 \oplus E_2 \mid \mu x.E_2 \mid a(r\langle E \rangle) \mid b(r\langle E \rangle) \mid a(l\langle E_1 \rangle) \mid b(l\langle E_1 \rangle)
\end{array}
\]
Intuitively, an expression of shape $a(l\langle E_1 \rangle)$ specifies a state that for an input $a$ has an output value specified by $E_1$. For example, the expression $a(l\langle 1 \rangle)$ specifies a state that for input $a$ outputs $1$, whereas in the case of $a(l\langle \emptyset \rangle)$ the output is $0$. An expression of shape $a(r\langle E \rangle)$ specifies a state that for a certain input $a$ has a transition to a new state represented by $E$. For example, the expression $\mu x.a(r\langle x \rangle)$ states that for input $a$, the machine will perform an "$a$-loop" transition, whereas $a(r\langle \emptyset \rangle)$ states that for input $a$ there is a transition to the state denoted by $\emptyset$. It is interesting to note that a state will only be fully specified in what concerns transitions and output (for a given input $a$) if both $a(l\langle E_1 \rangle)$ and $a(r\langle E \rangle)$ appear in the expression (combined by $\oplus$). In the case where only the transition (resp. output) is specified, the underspecification is solved by setting the output (resp. target state) to $\bot_B = 0$ (resp. $\emptyset$).

Next, to provide the reader with intuition, we will explain how one can reason on the bisimilarity of two simple expressions, by constructing bisimulation relations. Later on, we show how CIRC can be used in conjunction with our tool in order to act as a decision procedure when checking equivalence of two expressions, in a fully automated manner.

We will start with the expressions $\varepsilon_1 = \mu x.a(r\langle x \rangle)$ and $\varepsilon_2 = \emptyset$. We have to build a bisimulation relation $\mathcal{R}$ on $G$-expressions, such that $(\varepsilon_1, \varepsilon_2) \in \mathcal{R}$. We do this in the following way: we start by taking $\mathcal{R} = \{ (\varepsilon_1, \varepsilon_2) \}$ and we check whether this is already a bisimulation, by considering the output values and transitions and checking whether new pairs of expressions appear in this process. If new pairs of expressions appear we add them to $\mathcal{R}$ and repeat the process. Intuitively, this can be represented as follows:

Step 1: $\mathcal{R} = \{ (\varepsilon_1, \varepsilon_2) \}$; $\varepsilon_1 \xrightarrow{a|0} \varepsilon_1$, $\varepsilon_1 \xrightarrow{b|0} \varepsilon_2$ and $\varepsilon_2 \xrightarrow{a|0,\ b|0} \varepsilon_2$; the pair $(\varepsilon_2, \varepsilon_2)$ is not yet in $\mathcal{R}$; add it.
Step 2: $\mathcal{R} = \{ (\varepsilon_1, \varepsilon_2), (\varepsilon_2, \varepsilon_2) \}$; $\varepsilon_2 \xrightarrow{a|0,\ b|0} \varepsilon_2$ on both sides; no new pairs appear. ✓

Figure 4: Bisimulation construction

In the figure above, and as before, we use the notation ε₁ R ε₂ to denote (ε₁, ε₂) ∈ R. As illustrated in Figure 4, R = {(ε₁, ε₂), (ε₂, ε₂)} is closed under transitions and is therefore a bisimulation. Hence, ε₁ ∼_G ε₂.

The proved equality ∅ = µx.a(r⟨x⟩) might seem unexpected if the reader is familiar with labelled transition systems. The equality is sound because these are expressions specifying the behavior of a Mealy machine and, semantically, both denote the function that outputs 0 for every non-empty word (the semantics of Mealy machines is given by functions in B^(A⁺); intuitively, one can think of these expressions as both denoting the empty language). This is visible if one draws the automata corresponding to both expressions (say, for simplicity, the alphabet is A = {a}):

[Diagram: two one-state Mealy machines, one for ∅ and one for µx.a(r⟨x⟩), each with an a|0 self-loop.]

Note that (i) the ∅ expression for Mealy machines is mapped by δ to a function that for input a gives ⟨0, ∅⟩, which represents a state with an a-loop to itself and output 0; (ii) the second expression specifies explicitly an a-loop to itself and it also has output 0, since no output value is explicitly defined. Note also that similar expressions for labelled transition systems (LTS), or coalgebras of the functor P_ω(−)^A, would not be bisimilar, since one would have an a-transition and the other would not. This is because the ∅ expression for LTS really denotes a deadlock state. In operational terms they would be converted to the systems

[Diagram: ∅ is a state without transitions; µx.a(x) is a state with an a-loop.]

which now have an obvious difference in behavior.

By reasoning as in the example above, one can show that the expressions ε₁ = µx.a(r⟨x⟩) ⊕ b(r⟨x⟩) and ε₂ = µx.a(r⟨x⟩) are bisimilar, and the bisimulation relation is built as illustrated in Figure 5:

[Figure 5: Bisimulation construction. Starting from R = {(ε₁, ε₂)}, the pair (ε₁, ∅) is not yet in R and is added, giving R = {(ε₁, ε₂), (ε₁, ∅)}, which is closed under transitions.]
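The construction used in Figures 4 and 5 can be sketched as a small worklist procedure. The Python sketch below is only an illustration: it works on an explicit finite Mealy machine given as a dictionary rather than on expressions, and the names `bisimulation`, `machine` and the state labels `e1`, `e2`, `phi`, `one` are assumptions made for this example, not part of CIRC or of the tool described here.

```python
from collections import deque


def bisimulation(machine, s, t):
    """Try to build a bisimulation containing the pair (s, t).

    `machine` maps state -> input letter -> (output, next_state).
    Returns the relation as a set of pairs, or None as soon as two
    related states produce different outputs for the same input.
    """
    relation = set()
    todo = deque([(s, t)])
    while todo:
        p, q = todo.popleft()
        if (p, q) in relation:
            continue                      # pair is already in R
        relation.add((p, q))              # pair is not yet in R: add it
        for letter in machine[p]:
            out_p, next_p = machine[p][letter]
            out_q, next_q = machine[q][letter]
            if out_p != out_q:
                return None               # outputs disagree: not bisimilar
            todo.append((next_p, next_q))
    return relation


# Hypothetical states standing for expressions over A = {a, b}, B = {0, 1}:
#   e1  ~ mu x. a(r<x>) (+) b(r<x>)     e2  ~ mu x. a(r<x>)
#   phi ~ the empty expression          one ~ a(l<1>)
machine = {
    "e1":  {"a": (0, "e1"),  "b": (0, "e1")},
    "e2":  {"a": (0, "e2"),  "b": (0, "phi")},
    "phi": {"a": (0, "phi"), "b": (0, "phi")},
    "one": {"a": (1, "phi"), "b": (0, "phi")},
}

print(bisimulation(machine, "e1", "e2"))   # the two pairs of Figure 5
print(bisimulation(machine, "e1", "one"))  # None: outputs differ on input a
```

On this machine the first call returns the pairs (e1, e2) and (e1, phi), matching the relation of Figure 5, while the second call fails because the two states disagree on the output for input a. Termination follows from the finite number of state pairs; for expressions, the corresponding finiteness argument relies on the ACI axioms discussed later.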

Let us further consider the Mealy machine depicted in Figure 6, where all states are bisimilar.

[Figure 6: Mealy machine with bisimilar states]

We show how to check the equivalence of two expressions characterizing two states of this machine, in a fully automated manner, using CIRC. These expressions are ε₁ = µx.b(l⟨1⟩) ⊕ b(r⟨ε₂⟩) ⊕ a(r⟨µy.a(r⟨y⟩) ⊕ b(r⟨ε₂⟩) ⊕ b(l⟨1⟩)⟩) and ε₂ = µx.b(l⟨1⟩) ⊕ b(r⟨x⟩) ⊕ a(r⟨x⟩), respectively.

In order to check bisimilarity of ε₁ and ε₂ we load the tool and define the semilattice B = {0, 1} and the alphabet A = {a, b}:

    (jslt B is 0 1 bottom 0 . 0 v 0 = 0 . 0 v 1 = 1 . 1 v 1 = 1 . endjslt)
    (alph A is a b endalph)

We provide the functor G using the command (functor (B x Id) ^ A .). The command (set goal ... .) specifies the goal we want to prove:

    (set goal
      \mu X:FixpVar . b(l<1>) (+) a(l<0>) (+) b(r< X:FixpVar >) (+) a(r< X:FixpVar >)
      =
      \mu X:FixpVar . b(l<1>) (+)
        b(r<\mu X:FixpVar . b(l<1>) (+) b(r< X:FixpVar >) (+) a(r< X:FixpVar >)>) (+)
        a(r<\mu Y:FixpVar . a(r< Y:FixpVar >) (+)
            b(r<\mu X:FixpVar . b(l<1>) (+) a(l<0>) (+) b(r< X:FixpVar >) (+) a(r< X:FixpVar >)>) (+)
            b(l<1>)>) .)

In order to generate the CIRC specification we use the command (generate coalgebra .). Next we need to load CIRC along with the resulting specification and start the proof engine using the command (coinduction .).

As already shown, behind the scenes, CIRC builds a bisimulation relation that includes the initial goal. The proof succeeds and the output consists of (a subset of) this bisimulation:

    Proof succeeded.
    Number of derived goals: 2
    Number of proving steps performed: 50
    Maximum number of proving steps is set to: 256
    Proved properties:
    - phi (+) (\mu X . a(l<0>) (+) a(r<X>) (+) b(l<1>) (+) b(r<X>)) =
      phi (+) (\mu Y . a(r<Y>) (+) b(l<1>) (+)
      b(r<\mu X . a(l<0>) (+) a(r<X>) (+) b(l<1>) (+) b(r<X>)>))
    - \mu X . a(l<0>) (+) a(r<X>) (+) b(l<1>) (+) b(r<X>) =
      \mu Z . a(r<\mu Y . a(r<Y>) (+) b(l<1>) (+)
      b(r<\mu X . a(l<0>) (+) a(r<X>) (+) b(l<1>) (+) b(r<X>)>)>) (+)
      b(l<1>) (+) b(r<\mu X . a(l<0>) (+) a(r<X>) (+) b(l<1>) (+) b(r<X>)>)

For ease of understanding, we printed here a readable version of the proved properties. In Section 6.1, however, we show that internally each expression is brought to a canonical form by renaming the variables. Moreover, note that in our tool, ∅ is represented by the constant phi. All the examples provided in the current section make use of this convention.

As previously mentioned, CIRC is also able to detect when two expressions are not equivalent. Take, for instance, the expressions µx.a(l⟨ ⟩) ⊕ a(r⟨a(l⟨ ⟩) ⊕ a(r⟨x⟩)⟩) and a(l⟨ ⟩) ⊕ a(r⟨a(r⟨µx.a(r⟨x⟩) ⊕ a(l⟨ ⟩)⟩) ⊕ a(l⟨ ⟩)⟩), characterizing two states of the Mealy machines in Fig. 7. After following steps similar to the ones previously enumerated, the proof fails and the output message is Visible goal [...] failed during coinduction .

[Figure 7: Mealy machines with two non-bisimilar states]

Example 4.

Let us show how one may check strong bisimilarity of two nondeterministic processes of a non-trivial CCS-like language with termination, deadlock, and divergence, as studied in [1]. A process is a guarded, closed term defined by the following grammar:

    P ::= ✓ | δ | Ω | a.P | P₁ + P₂ | x | µx.P        (20)

where:
• ✓ is the constant for successful termination,
• δ denotes deadlock,
• Ω is the divergent computation (i.e., the undefined process),
• a.P is the process executing the action a and then continuing as the process P, for any action a from a given set A,
• P₁ + P₂ is the non-deterministic process behaving as either P₁ or P₂, and
• µx.P is the recursive process P[µx.P/x].

In [23] it is shown that, up to strong bisimilarity, the above syntax of processes is equivalent to the canonical set of (guarded, closed) regular expressions derived for the functor 1 + P_ω(Id)^A:

    E  ::= ∅ | E ⊕ E | x | µx.E | l[E₁] | r[E₂]
    E₁ ::= ∅ | E₁ ⊕ E₁ | 1
    E₂ ::= ∅ | E₂ ⊕ E₂ | a(E₃)
    E₃ ::= ∅ | E₃ ⊕ E₃ | {E}

The translation map (−)† from processes to expressions is defined by induction on the structure of the process:

    (✓)† = l[1]        (a.P)† = r[a({P†})]
    (δ)† = r[∅]        (P₁ + P₂)† = (P₁)† ⊕ (P₂)†
    (Ω)† = ∅           (µx.P)† = µx.P†
    x† = x

Consider now two processes P and Q over the alphabet A = {a, b}:

    P = µx.(a.x + a.P₁ + b.b.✓ + b.(δ + Ω))
    Q = µz.(a.z + b.(δ + b.✓) + b.δ)

where P₁ = µy.(a.(y + δ) + b.δ + b.(δ + b.✓) + δ). Graphically, the two processes can be represented by labelled transition systems (for simplicity we omit annotating states with information regarding the satisfiability of successful termination, divergence, and deadlock):

[Figure 8: Nondeterministic processes: Q ∼ P]

We want to check whether the process P is strongly bisimilar to the process Q. By using the above translation, process P is represented by the expression

    µx.(r[a({µy.(r[a({y ⊕ r[∅]})] ⊕ r[b({r[∅]})] ⊕ r[b({r[∅] ⊕ r[b({l[1]})]})] ⊕ r[∅])})] ⊕ r[a({x})] ⊕ r[b({r[b({l[1]})]})] ⊕ r[b({r[∅] ⊕ ∅})])

whereas process Q is represented by the expression

    µz.(r[a({z})] ⊕ r[b({r[∅] ⊕ r[b({l[1]})]})] ⊕ r[b({r[∅]})]).

In order to use the tool, one needs to specify the semilattice, the alphabet, the functor, and the goal in a manner similar to the one previously presented:

    (jslt B is 1 bottom 1 . 1 v 1 = 1 . endjslt)
    (alph A is a b endalph)
    (functor B + (P Id) ^ A .)
    (set goal
      \mu X:FixpVar .
        r[ a( { X:FixpVar } ) ] (+)
        r[ a( { \mu Y:FixpVar .
                  r[ a( { Y:FixpVar (+) r[ phi ] } ) ] (+)
                  r[ b( { r[ phi ] } ) ] (+)
                  r[ b( { r[ phi ] (+) r[ b( { l[ 1 ] } ) ] } ) ] (+)
                  r[ phi ] } ) ] (+)
        r[ b( { r[ b( { l[ 1 ] } ) ] } ) ] (+)
        r[ b( { r[ phi ] (+) phi } ) ]
      =
      \mu Z:FixpVar .
        r[ a( { Z:FixpVar } ) ] (+)
        r[ b( { r[ phi ] (+) r[ b( { l[ 1 ] } ) ] } ) ] (+)
        r[ b( { r[ phi ] } ) ] .)

For the generated specification CIRC terminates and outputs a positive result:

    Proof succeeded.
    Number of derived goals: 15
    Number of proving steps performed: 58
    Maximum number of proving steps is set to: 256
    Proved properties:
    - r[phi] (+) (\mu Y. r[phi] (+) r[a( { r[phi] (+) Y } )] (+) r[b( { r[phi] } )]
      (+) r[b( { r[phi] (+) r[b( { l[1] } )] } )])
      =
      \mu Z. r[a( { Z } )] (+) r[b( { r[phi] } )] (+) r[b( { r[phi] (+) r[b( { l[1] } )] } )]
    - r[b( { l[1] } )] = r[phi] (+) r[b( { l[1] } )]
    - \mu Y. r[phi] (+) r[a( { r[phi] (+) Y } )] (+) r[b( { r[phi] } )] (+)
      r[b( { r[phi] (+) r[b( { l[1] } )] } )]
      =
      \mu Z. r[a( { Z } )] (+) r[b( { r[phi] } )] (+) r[b( { r[phi] (+) r[b( { l[1] } )] } )]
    - \mu X. r[a( { X } )] (+) r[a( { \mu Y. r[phi] (+) r[a( { r[phi] (+) Y } )] (+)
      r[b( { r[phi] } )] (+) r[b( { r[phi] (+) r[b( { l[1] } )] } )] } )] (+)
      r[b( { r[phi] (+) phi } )] (+) r[b( { r[b( { l[1] } )] } )]
      =
      \mu Z. r[a( { Z } )] (+) r[b( { r[phi] } )] (+) r[b( { r[phi] (+) r[b( { l[1] } )] } )]

In this section we present details on the implementation of the algebraic specification given in Section 4, based on the examples from Section 6.

In order to generate the algebraic specifications for CIRC when provided a functor and two expressions, we used the Maude system [4]. We chose it for its suitability for performing equational and rewriting logic based computations, and because of its reflective properties, which allow for the development of advanced metalanguage applications. As the technical aspects of working at the meta-level are beyond the scope of this paper, we refrain from presenting them and show, instead, what the generated specifications consist of.

Most of the algebraic specifications from Section 4 have a straightforward implementation in Maude. Consider, for instance, the case of Mealy machines presented in Example 3. The generated grammars for functors (1) and expressions (Definition 3) are coded as:

    sort Functor .
    sorts AlphName SltName .
    subsort SltName < Functor .
    op A : -> AlphName .
    op B : -> SltName .
    op G : -> Functor .
    op Id : -> Functor .
    op _+_ : Functor Functor -> Functor .
    op _^_ : Functor AlphName -> Functor .
    op _x_ : Functor Functor -> Functor .
    eq G = (B x Id) ^ A .

    sorts Exp ExpStruct Alph Slt .
    subsort Exp < ExpStruct .
    enum A is a b .
    enum B is 0 1 .
    subsort A < Alph .
    subsort B < Slt .
    op _`(+`)_ : Exp Exp -> Exp .
    op _`(_`) : Alph Exp -> Exp .
    op \mu_._ : FixpVar Exp -> Exp .
    ops l<_> r<_> : Exp -> Exp .
    op phi : -> Exp .

Most of the syntactical constructs are Maude-specific: sorts and subsort declare the sorts we work with and, respectively, the relations between them; op declares operators; eq declares equations (the equation in our case defines the shape of the functor G). The only CIRC-specific construct, enum, is syntactic sugar for declaring enumerable sorts, i.e., sorts that consist only of the specified constants. As a side note, if brackets ( "(", "[", "{" ) are used in the declaration of an operation, then they must be preceded by a backquote ( ` ).

As mentioned in Section 2, in order to guarantee the finiteness of our procedure, one needs to include the ACI axioms for (+). Moreover, we have observed that the unity axiom for (+) plays an important role in decreasing the number of states generated by the repeated application of δ_G, therefore improving the overall time performance of the tool. For example, the number of rewritings CIRC performed in order to prove the bisimilarity of ε₁ and ε₂ in Figure 5 was halved when the unity axiom was used.

By turning on the axiomatization flag using the command (axioms on .), the following code is generated:

    op _`(+`)_ : Exp Exp -> Exp [assoc comm] .
    eq E:Exp (+) E:Exp = E:Exp .
    eq E:Exp (+) phi = E:Exp .

An obvious question is why not add other axioms to the tool, since the unity axiom has improved performance. At this stage we have not studied in detail how much adding other axioms would help. It is in any case a trade-off between how many extra axioms one includes, which brings the automaton produced from an expression closer to the minimal automaton, and how much time the tool takes to reduce the expressions in each step modulo the axioms. For classical regular expressions, there is an interesting empirical study on this [16]. We leave it as future work to carry out a similar study for our expressions and axioms.

The process of substituting fixed-point variables has a natural implementation. We present the equations handling the basic expressions ∅ and x, and the operation (+):

    op _`[_/_`] : Exp Exp FixpVar -> Exp .
    eq phi [ E:Exp / X:FixpVar ] = phi .
    ceq Y:FixpVar [ E:Exp / X:FixpVar ] = E:Exp if (X:FixpVar == Y:FixpVar) .
    eq Y:FixpVar [ E:Exp / X:FixpVar ] = Y:FixpVar [owise] .
    eq (E1:Exp (+) E2:Exp) [ E:Exp / X:FixpVar ] =
       (E1:Exp [E:Exp / X:FixpVar]) (+) (E2:Exp [E:Exp / X:FixpVar]) .

In order to avoid matching problems and to overcome the fact that in Maude one cannot handle an equation that has fresh variables in its right-hand side (i.e., variables that do not appear in the left-hand side), we replace expression variables with parameterized constants:

    op var : Nat -> FixpVar .
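To make the role of the ACI and unity axioms and of the substitution equations more concrete, the following Python sketch mirrors them on a small expression datatype. It is an illustration under stated assumptions, not the generated Maude specification: the tuple encoding and the helper names `plus`, `subst` and `var` are hypothetical, and the cases for `a(E)` and `\mu` are a straightforward extension of the three equations shown above (for ∅, variables and (+)), written here only for completeness.

```python
PHI = ("phi",)                      # the empty expression, written phi in the tool


def var(n):
    """Parameterised constant var(n) standing for a fixed-point variable."""
    return ("var", n)


def plus(*terms):
    """Sum of expressions, normalised modulo ACI and the unit law E (+) phi = E."""
    summands = set()
    for t in terms:
        if t[0] == "plus":
            summands |= t[1]        # associativity: flatten nested sums
        elif t != PHI:
            summands.add(t)         # unit: drop phi; idempotence: set semantics
    if not summands:
        return PHI
    if len(summands) == 1:
        return next(iter(summands))
    return ("plus", frozenset(summands))   # commutativity: order is irrelevant


def subst(e, repl, x):
    """Return e[repl / var(x)]; repl is assumed closed, so no capture can occur."""
    tag = e[0]
    if tag == "phi":
        return PHI
    if tag == "var":
        return repl if e[1] == x else e
    if tag == "plus":
        return plus(*(subst(s, repl, x) for s in e[1]))
    if tag == "act":                # ("act", letter, body), e.g. a(E)
        return ("act", e[1], subst(e[2], repl, x))
    if tag == "mu":                 # ("mu", n, body) binds var(n)
        return e if e[1] == x else ("mu", e[1], subst(e[2], repl, x))
    raise ValueError(f"unknown expression: {e!r}")


x, y = var(1), var(2)
body = plus(("act", "a", x), y)
print(plus(x, plus(y, x)) == plus(y, x))          # ACI: duplicates and order vanish
print(subst(body, PHI, 2) == ("act", "a", x))     # the phi summand is dropped by the unit law
```

Representing a sum as a set of summands gives associativity, commutativity and idempotence for free, which is one simple way to see why normalising modulo these axioms keeps the number of distinct expressions small.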

The operation that brings an expression to its canonical form (by renaming the variables) has an inductive definition on the structure of the given expression and makes use of the substitution operation presented above. For this reason, the bisimulation CIRC builds contains parameterized constants instead of the user-declared variables. The property proved in Example 4 is, therefore, written as:

    \mu var(2) . r[a( { var(2) } )] (+) r[a( { \mu var(1) . r[phi] (+)
    r[a( { r[phi] (+) var(1) } )] (+) r[b( { r[phi] } )] (+) r[b( { r[phi] (+)
    r[b( { l[1] } )] } )] } )] (+) r[b( { r[phi] (+) phi } )] (+) r[b( { r[b( { l[1] } )] } )]
    =
    \mu var(1) . r[a( { var(1) } )] (+) r[b( { r[phi] } )] (+)
    r[b( { r[phi] (+) r[b( { l[1] } )] } )]

The most important part of the algebraic specification consists of the equations defining the operations δ( ), Plus( , ), and Empty. Most of these equations are implemented as presented in [23]. The only difficulties we encountered were for the exponentiation case, as Maude does not handle higher-order functions. Without entering into details, as a workaround, we introduced a new sort Function < ExpStruct and an operation \ . : ExpoCase Alph Functor ExpStruct -> Function in order to emulate function-passing. The first argument is used to memorize the origin where the exponentiation ingredient is encountered: δ, Plus, or Empty. Its purpose is purely technical: we use it in order to avoid some internal matching problems. The other three parameters are those of the structured expression λ.(a, F ⊳ G, σ) presented in Section 4: a letter in the alphabet, an ingredient, and some other structured expression.

Another thing worth describing is the way we enable CIRC to prove equivalences when the powerset functor occurs. Namely, we present how interpolant (15) is implemented. Recall that we want to show that two sets of expressions are equivalent, which means that for each expression in the first set there must be an equivalent one in the second set and vice versa.

In order to handle sets of structured expressions we introduce a new sort, ExpStructSet, as a supersort for ExpStruct. We also consider the set separator _,_ : ExpStructSet ExpStructSet -> ExpStructSet [assoc comm], the empty set emptyS : -> ExpStructSet, and the set wrapping operation {_} : ExpStructSet -> ExpStruct. In order to mimic universal quantification over a set, we use a special constant referred to as the token "[/]". In what follows, we consider two variables of sort ExpStruct: ES and ES', and two variables of sort ExpStructSet: ESS and ESS'. We now describe the process of finding the equivalence between two sets:

• whenever encountering two wrapped expression sets we add the universal quantification token to each of them in two distinct goals:

    srl { ESS } = { ESS' } => { [/] ESS } = { ESS' } /\ { ESS } = { [/] ESS' } .

• iterate through the expressions on the left-hand side (similarly for the other direction):

    srl { [/] (ES , ESS) } = { ESS' } => { [/] ES } = { ESS' } /\ { [/] ESS } = { ESS' } .
    srl { ESS } = { [/] (ES' , ESS') } => { ESS } = { [/] ES' } /\ { ESS } = { [/] ESS' } .

• when left with one expression on the left-hand side, start iterating through the expressions on the right-hand side until finding an equivalence (similarly for the other direction):

    srl { [/] ES } = { ES' , ESS' } => ES = ES' \/ { [/] ES } = { ESS' } .
    srl { ES , ESS } = { [/] ES' } => ES = ES' \/ { ESS } = { [/] ES' } .

• if no equivalence has been found, transform the current goal into a visible failure:

    srl { ESS } = emptyS => true = false .
    srl emptyS = { ESS } => true = false .

Finally, the type checker for structured expressions has a straightforward implementation. Its code does not appear in the generated specification, as it is only used when the tool receives the expressions as input. This prevents the specification from being generated and the prover from being started when invalid expressions are provided.

Discussion

One of the major contributions of this paper is that we provided a decision procedure for the bisimilarity of generalized regular expressions. In order to enable the implementation of the decision procedure, we have exploited an encoding of coalgebra into algebra, and we formalized the equivalence between the coalgebraic concepts associated to non-deterministic coalgebras [23] and their algebraic correspondents. This led to the definition of algebraic specifications (E_G) that model both the language and the coalgebraic structure of expressions. Moreover, we defined an equational deduction relation (⊢_NDF), used on the algebraic side for reasoning on the bisimilarity of expressions.

The most important result of the parallel between the coalgebraic and algebraic approaches is given in Corollary 1, which formalizes the definition of the bisimulation relations in algebraic terms. This result is the key to proving the soundness of the decision procedure implemented in the automated prover CIRC [14]. As a coinductive prover, CIRC builds a relation F closed under the application of δ_G with respect to ⊢_NDF (E_G ∪ F ⊢_NDF δ_G(F)), hence automatically computing a bisimulation to which the initial proof obligations belong.

The approach we present in this paper enables CIRC to perform reasoning based on bisimulations (instead of experiments [17]). This way, the prover is extended to checking bisimilarity in a large class of systems that can be modeled as non-deterministic coalgebras. Note that the constructions above are all automated: the (non-trivial) CIRC algebraic specification describing E_G, together with the interpolants implementing ⊢_NDF, are generated with the Maude tool presented in Section 6.

We now mention some of the existing coalgebra-based tools for proving bisimilarity and the main differences with the tool presented in this paper. CoCasl [8] and CCSL [19] are tools that can generate proof obligations for theorem provers from coalgebraic specifications. In [8] several tactics for interactive and automatic bisimulation building are implemented in Isabelle/HOL and are used to derive bisimilarities for translated specifications from CoCasl. The main difference between our tool and CoCasl or CCSL is that, given a functor, the tool derives a specification language for which equivalence is decidable (that is, it is automatic and not interactive). CIRC [5, 17], on top of which the current tool is built, is based on hidden logic [18] and uses a partial decision procedure for proving bisimilarities via implicit construction of bisimulations. Our tool can be seen as an extension of CIRC to a fully automatic theorem prover for the class of non-deterministic coalgebras. We stress the fact that the focus of this paper was on a language for which equivalence is decidable. Tools such as CoCasl, CCSL or CIRC have a more expressive language, where one can, for instance, specify streams which could not be specified in our language (intuitively, the streams we can specify in our language are eventually periodic). In those tools, however, decidability of equivalence cannot be guaranteed.

There are several directions for future work.

Extending the class of systems to include quantitative coalgebras (such as weighted automata and Markov chains) will enlarge the scope of applicability of the tool. The challenge in this extension arises from the fact that the definition of expressions for quantitative coalgebras involving the distribution monad is not as modular as for the other functors (for details see [22]). This is a consequence of the fact that the sum of two valid expressions might not be a valid expression anymore (since in distributions we require that the probabilities add up to 1). Moreover, calculating bisimulation relations in the quantitative setting will encompass metric manipulation, which is currently not implemented in CIRC.

To improve usability, building a graphical interface for the tool is an obvious next step. The graphical interface should ideally allow the specification of expressions by means of systems of equations (which are then solved internally) or even by means of an automaton, which would then be translated to an expression using Kleene's theorem. We would also like to explore how adding more axioms than ACI to the prover (that is, each step of the bisimulation checking is performed modulo more equations) improves the performance. Our experience so far shows that by adding the axiom for the distribution of the ∅ expression through the constructors, i.e. ∅ ⊕ ε = ε, the prover works significantly faster.

We have not yet studied complexity bounds for the algorithms presented in this paper. We conjecture, however, that the bounds will be very similar to those already known for classical regular expressions [13, 25]. Further explorations in this direction are left as future work.

Acknowledgments.

We would like to thank the referees for the many constructive comments, which greatly helped us to improve the paper. The authors are also grateful for useful comments from Luca Aceto, Filippo Bonchi, and Miguel Palomino Tarjuelo. The work of Georgiana Caltais and Eugen-Ioan Goriac has been partially supported by the project 'Meta-theory of Algebraic Process Theories' (nr. 100014021) of the Icelandic Research Fund. The work of Eugen-Ioan Goriac has also been partially supported by the project 'Extending and Axiomatizing Structural Operational Semantics: Theory and Tools' (nr. 110294-0061) of the Icelandic Research Fund. The work of Dorel Lucanu has been partially supported by the PNII grant DAK project Contract 161/15.06.2010, SMIS-CSNR 602-12516. The work of Alexandra Silva was partially funded by ERDF - European Regional Development Fund through the COMPETE Programme and by Fundação para a Ciência e a Tecnologia, Portugal within projects FCOMP-01-0124-FEDER-020537 and SFRH/BPD/71956/2010.

References

[1] L. Aceto and M. Hennessy. Termination, deadlock, and divergence. J. ACM, 39:147–187, January 1992.
[2] M. Bonsangue, G. Caltais, E.-I. Goriac, D. Lucanu, J. Rutten, and A. Silva. A decision procedure for bisimilarity of generalized regular expressions. In Proceedings of the 13th Brazilian conference on Formal methods: foundations and applications, SBMF'10, pages 226–241, Berlin, Heidelberg, 2011. Springer-Verlag.
[3] A. Bouhoula, J.-P. Jouannaud, and J. Meseguer. Specification and proof in membership equational logic. Theor. Comput. Sci., 236(1-2):35–132, 2000.
[4] M. Clavel, F. Durán, S. Eker, P. Lincoln, N. Martí-Oliet, J. Meseguer, and C. Talcott. All about Maude - a high-performance logical framework: how to specify, program and verify systems in rewriting logic. Springer-Verlag, Berlin, Heidelberg, 2007.
[5] J. Goguen, K. Lin, and G. Roşu. Circular coinductive rewriting. In ASE '00: Proceedings of the 15th IEEE international conference on Automated software engineering, pages 123–132, Washington, DC, USA, 2000. IEEE Computer Society.
[6] J. A. Goguen. Order-sorted algebra I: Equational deduction for multiple inheritance, overloading, exceptions and partial operations. Theoretical Computer Science, 105:217–273, 1992.
[7] E.-I. Goriac, D. Lucanu, and G. Roşu. Automating coinduction with case analysis. In Proceedings of the 12th international conference on Formal engineering methods and software engineering, ICFEM'10, pages 220–236, Berlin, Heidelberg, 2010. Springer-Verlag.
[8] D. Hausmann, T. Mossakowski, and L. Schröder. Iterative Circular Coinduction for CoCasl in Isabelle/HOL. In M. Cerioli, editor, FASE, volume 3442 of Lecture Notes in Computer Science, pages 341–356. Springer, 2005.
[9] C. Hermida and B. Jacobs. Structural induction and coinduction in a fibrational setting. Inf. Comput., 145(2):107–152, 1998.
[10] S. Kleene. Representation of events in nerve nets and finite automata. Automata Studies, pages 3–42, 1956.
[11] D. Kozen. A completeness theorem for Kleene algebras and the algebra of regular events. In LICS, pages 214–225. IEEE Computer Society, 1991.
[12] D. Kozen. Myhill-Nerode relations on automatic systems and the completeness of Kleene algebra. In A. Ferreira and H. Reichel, editors, STACS, volume 2010 of Lecture Notes in Computer Science, pages 27–38. Springer, 2001.
[13] D. Kozen. On the coalgebraic theory of Kleene algebra with tests. Technical Report http://hdl.handle.net/1813/10173, Computing and Information Science, Cornell University, March 2008.
[14] D. Lucanu, E.-I. Goriac, G. Caltais, and G. Roşu. CIRC: a behavioral verification tool based on circular coinduction. In Proceedings of the 3rd international conference on Algebra and coalgebra in computer science, CALCO'09, pages 433–442, Berlin, Heidelberg, 2009. Springer-Verlag.
[15] R. Milner. A complete inference system for a class of regular behaviours. J. Comput. System Sci., 28(3):439–466, 1984.
[16] S. Owens, J. H. Reppy, and A. Turon. Regular-expression derivatives re-examined. J. Funct. Program., 19(2):173–190, 2009.
[17] G. Roşu and D. Lucanu. Circular coinduction: a proof theoretical foundation. In Proceedings of the 3rd international conference on Algebra and coalgebra in computer science, CALCO'09, pages 127–144, Berlin, Heidelberg, 2009. Springer-Verlag.
[18] G. Roşu. Hidden Logic. PhD thesis, University of California at San Diego, 2000.
[19] J. Rothe, H. Tews, and B. Jacobs. The coalgebraic class specification language CCSL. J. UCS, 7(2):175–193, 2001.
[20] J. J. M. M. Rutten. Universal coalgebra: a theory of systems. Theor. Comput. Sci., 249(1):3–80, 2000.
[21] A. Salomaa. Two complete axiom systems for the algebra of regular events. J. ACM, 13(1):158–169, 1966.
[22] A. Silva, F. Bonchi, M. Bonsangue, and J. Rutten. Quantitative Kleene coalgebras. Information and Computation, 209(5):822–849, 2011.
[23] A. Silva, M. M. Bonsangue, and J. J. M. M. Rutten. Non-deterministic Kleene coalgebras. Logical Methods in Computer Science, 6(3), 2010.
[24] S. Staton. Relating coalgebraic notions of bisimulation: with applications to name-passing process calculi. In Proceedings of the 3rd international conference on Algebra and coalgebra in computer science, CALCO'09, pages 191–205, Berlin, Heidelberg, 2009. Springer-Verlag.
[25] J. Worthington. Automatic proof generation in Kleene algebra. In R. Berghammer, B. Möller, and G. Struth, editors, RelMiCS, volume 4988 of