Fixpoint Theory – Upside Down
Paolo Baldan, Richard Eggert, Barbara König, and Tommaso Padoan
Università di Padova, Italy · Universität Duisburg-Essen, Germany
Abstract.
Knaster-Tarski’s theorem, characterising the greatest fixpoint of a monotone function over a complete lattice as the largest post-fixpoint, naturally leads to the so-called coinduction proof principle for showing that some element is below the greatest fixpoint (e.g., for providing bisimilarity witnesses). The dual principle, used for showing that an element is above the least fixpoint, is related to inductive invariants. In this paper we provide proof rules which are similar in spirit but are used for showing that an element is above the greatest fixpoint or, dually, below the least fixpoint. The theory is developed for non-expansive monotone functions on suitable lattices of the form M^Y, where Y is a finite set and M an MV-algebra, and it is based on the construction of (finitary) approximations of the original functions. We show that our theory applies to a wide range of examples, including termination probabilities, behavioural distances for probabilistic automata and bisimilarity. Moreover, it allows us to derive original algorithms for solving simple stochastic games.

1 Introduction

Fixpoints are ubiquitous in computer science, as they give meaning to inductive and coinductive definitions (see, e.g., [25,22]). By Knaster-Tarski’s theorem [27], a monotone function f : L → L over a complete lattice (L, ⊑) admits a least fixpoint µf and a greatest fixpoint νf, which are characterised as the least pre-fixpoint and the greatest post-fixpoint, respectively. This immediately gives well-known proof principles for showing that a lattice element l ∈ L is below νf or above µf:

    l ⊑ f(l)            f(l) ⊑ l
    --------            --------
    l ⊑ νf              µf ⊑ l

On the other hand, showing that a given element l is above νf or below µf is more difficult. One can think of using the characterisation of least and greatest fixpoints via Kleene’s iteration.
E.g., the greatest fixpoint is the least element of the (possibly transfinite) descending chain obtained by iterating f from ⊤. Showing that f^i(⊤) ⊑ l for some i, one concludes that νf ⊑ l. This proof principle is related to the notion of ranking functions. However, it yields a less satisfying notion of witness, since f has to be applied i times, which can be inefficient or unfeasible when i is an infinite ordinal.

The aim of this paper is to present an alternative proof rule for this purpose, for functions over lattices of the form L = M^Y, where Y is a finite set and M is an MV-chain, i.e., a totally ordered complete lattice endowed with suitable operations of sum and complement. This allows us to capture several examples, ranging from ordinary relations, used for dealing with bisimilarity, to behavioural metrics, termination probabilities and simple stochastic games.

Assume that f : M^Y → M^Y is monotone and consider the question of proving that some fixpoint a : Y → M is the greatest fixpoint νf. The idea is to show that there is no “slack” or “wiggle room” in the fixpoint a that would allow us to further increase it. This is done by associating with every a : Y → M a function f_a on subsets of Y whose greatest fixpoint gives us the elements of Y where there is a potential for increasing a by adding a constant. If no such potential exists, i.e. νf_a is empty, we conclude that a is νf. A similar function f^a (specifying decrease instead of increase) exists for the case of least fixpoints. Note that the premise is νf_a = ∅, i.e. the witness remains coinductive. The proof rules are:

    f(a) = a    νf_a = ∅            f(a) = a    νf^a = ∅
    --------------------            --------------------
          νf = a                          µf = a

For applying the rule we compute a greatest fixpoint over subsets of Y, which is a finite set, instead of working in the potentially infinite lattice M^Y. The rule does not work for all monotone functions f : M^Y → M^Y, but we show that it is valid whenever f is non-expansive.
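For contrast, on a finite lattice the naive route via Kleene iteration is effective. The following minimal sketch (the graph is our own toy example, not from the paper) computes a greatest fixpoint by descending iteration from ⊤ in the powerset lattice 2^X:

```python
# Descending Kleene iteration from the top element of the lattice 2^X.
# f(S) = states with at least one successor in S; its greatest fixpoint
# is the set of states from which an infinite run exists.
succ = {1: {2}, 2: {1}, 3: {4}, 4: set()}
X = set(succ)

def f(S):
    return {x for x in X if succ[x] & S}

S = X                       # start at the top element X
while f(S) != S:            # chain X ⊒ f(X) ⊒ f²(X) ⊒ ... stabilises
    S = f(S)
print(sorted(S))            # → [1, 2]
```

Here the chain stabilises after finitely many steps; over an infinite lattice such as M^Y the analogous descending chain may need transfinitely many steps, which is exactly the situation the proof rules above are designed to avoid.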
Actually, the rule is not only sound but also reversible, i.e., if a = νf then νf_a = ∅, providing an if-and-only-if characterisation of whether a given fixpoint is the greatest fixpoint.

Quite interestingly, under the same assumptions on f, using a restricted function f*_a, the rule can be applied, more generally, when a is just a pre-fixpoint (f(a) ⊑ a), and it allows us to conclude that νf ⊑ a. A dual result holds for post-fixpoints in the case of least fixpoints.

    f(a) ⊑ a    νf*_a = ∅            a ⊑ f(a)    ν(f^a)* = ∅
    ---------------------            -----------------------
          νf ⊑ a                            a ⊑ µf

As already mentioned, the theory above applies to many interesting scenarios: witnesses for non-bisimilarity, algorithms for simple stochastic games [10], and lower bounds for termination probabilities and behavioural metrics in the setting of probabilistic systems [1] and probabilistic automata [2]. In particular, we were inspired by, and generalise, the self-closed relations of Fu [15], also used in [2].
Motivating example.
Consider a Markov chain (S, T, η) with a finite set of states S, where T ⊆ S are the terminal states and every state s ∈ S \ T is associated with a probability distribution η(s) ∈ D(S). (Here D(S) denotes the set of all maps p : S → [0, 1] such that Σ_{s∈S} p(s) = 1.) Intuitively, η(s)(s′) denotes the probability of state s choosing s′ as its successor. Assume that, given a fixed state s ∈ S, we want to determine the termination probability of s, i.e. the probability of reaching any terminal state from s. As a concrete example, take the Markov chain given in Fig. 1, where u is the only terminal state. Termination probabilities are governed by the function

    T : [0, 1]^S → [0, 1]^S
    T(t)(s) = 1                              if s ∈ T
    T(t)(s) = Σ_{s′∈S} η(s)(s′) · t(s′)      otherwise
Fig. 1: Function T (left) and a Markov chain with two fixpoints of T (right). (The diagram itself is not reproduced here; it involves the states x, y, z and the terminal state u, with transitions of probability 1/3.)

The termination probability arises as the least fixpoint of the function T defined as in Fig. 1. The values of µT are indicated in green (left value). Now consider the function t assigning to each state the termination probability written in red (right value). It is not difficult to see that t is another fixpoint of T, in which states y and z convince each other, incorrectly, that they terminate with probability 1, resulting in a vicious cycle that gives “wrong” results. We want to show that µT ≠ t without knowing µT. Our idea is to compute the set of states that still have some “wiggle room”, i.e., those states which could reduce their termination probability by δ if all their successors did the same. This definition has a coinductive flavour, and it can be computed as a greatest fixpoint on the finite powerset 2^S of states, instead of on the infinite lattice [0, 1]^S. We hence consider a function T^t : 2^{[S]^t} → 2^{[S]^t}, dependent on t, defined as follows. Let [S]^t be the set of all states s where t(s) > 0, i.e., where a reduction is in principle possible. Then a state s ∈ [S]^t is in T^t(S′) iff s ∉ T and, for all s′ for which η(s)(s′) > 0, we have s′ ∈ S′, i.e. all successors of s are in S′.

The greatest fixpoint of T^t is {y, z}. The fact that it is not empty means that there is some “wiggle room”, i.e., the value of t can be reduced on the elements of {y, z}, and thus t cannot be the least fixpoint of T. Moreover, the intuition that t can be improved on {y, z} can be made precise, leading to the possibility of performing the improvement and searching for the least fixpoint from there.

Contributions.
In the paper we formalise the theory outlined above, showing that the proof rules work for non-expansive monotone functions f on lattices of the form M^Y, where Y is a finite set and M a (potentially infinite) MV-algebra (§§ 3, 4). For a given f we show how to obtain the corresponding approximation compositionally (§ 5). We then work out several applications, including termination probabilities, behavioural metrics and bisimilarity (§ 6) and simple stochastic games (§ 7).

2 Preliminaries

In this section, we review some basic notions used in the paper, concerning complete lattices and MV-algebras [20].

A preordered or partially ordered set (P, ⊑) is often denoted simply as P, omitting the order relation. Given x, y ∈ P with x ⊑ y, we denote by [x, y] the interval {z ∈ P | x ⊑ z ⊑ y}. The join and the meet of a subset X ⊆ P (if they exist) are denoted ⊔X and ⊓X, respectively.

A complete lattice is a partially ordered set (L, ⊑) such that each subset X ⊆ L admits a join ⊔X and a meet ⊓X. A complete lattice (L, ⊑) always has a least element ⊥ = ⊔∅ and a greatest element ⊤ = ⊓∅.

A function f : L → L is monotone if for all l, l′ ∈ L, l ⊑ l′ implies f(l) ⊑ f(l′). By Knaster-Tarski’s theorem [27, Thm. 1], any monotone function on a complete lattice has a least and a greatest fixpoint, denoted respectively µf and νf, characterised as the meet of all pre-fixpoints and the join of all post-fixpoints, respectively: µf = ⊓{l | f(l) ⊑ l} and νf = ⊔{l | l ⊑ f(l)}.

Let (C, ⊑), (A, ≤) be complete lattices. A Galois connection is a pair of monotone functions ⟨α, γ⟩ such that α : C → A, γ : A → C and, for all a ∈ A and c ∈ C, α(c) ≤ a iff c ⊑ γ(a). Equivalently, for all a ∈ A and c ∈ C, (i) c ⊑ γ(α(c)) and (ii) α(γ(a)) ≤ a. In this case we write ⟨α, γ⟩ : C → A. For a Galois connection ⟨α, γ⟩ : C → A, the function α is called the left (or lower) adjoint and γ the right (or upper) adjoint.

Galois connections are at the heart of abstract interpretation [12,13]. In particular, when ⟨α, γ⟩ is a Galois connection, given monotone functions f_C : C → C and f_A : A → A, if f_C ∘ γ ⊑ γ ∘ f_A, then νf_C ⊑ γ(νf_A).
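As a quick sanity check of the adjunction condition, consider a toy Galois connection (our own example, not from the paper): sets of numbers in C = (2^{0..9}, ⊆) are abstracted in A = (2^{{even,odd}}, ⊆) by the parities they contain. The sketch below verifies α(c) ≤ a iff c ⊑ γ(a) exhaustively on a sample.

```python
# A toy Galois connection: alpha maps a set of numbers to the set of
# parities occurring in it; gamma maps parities to all numbers having them.
from itertools import combinations

def parity(n):
    return "even" if n % 2 == 0 else "odd"

def alpha(S):                 # left adjoint, C → A
    return {parity(n) for n in S}

def gamma(P):                 # right adjoint, A → C
    return {n for n in range(10) if parity(n) in P}

abstracts = [set(), {"even"}, {"odd"}, {"even", "odd"}]
concretes = [set(c) for r in range(4) for c in combinations(range(10), r)]
for S in concretes:
    for P in abstracts:
        # the adjunction condition: alpha(S) ≤ P  iff  S ⊑ gamma(P)
        assert (alpha(S) <= P) == (S <= gamma(P))
print("adjunction checked on", len(concretes) * len(abstracts), "pairs")
```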
If equality holds, i.e., f_C ∘ γ = γ ∘ f_A, then greatest fixpoints are preserved along the connection, i.e., νf_C = γ(νf_A).

Given a set Y and a complete lattice L, the set of functions L^Y = {f | f : Y → L}, endowed with the pointwise order (a ⊑ b if a(y) ⊑ b(y) for all y ∈ Y), is a complete lattice. In the paper we will mostly work with lattices of the form M^Y, where M is a special kind of lattice with a rich algebraic structure, namely an MV-algebra [20].

Definition 2.1 (MV-algebra). An MV-algebra is a tuple M = (M, ⊕, 0, ¬(·)) where (M, ⊕, 0) is a commutative monoid and ¬(·) : M → M maps each element to its complement, such that for all x, y ∈ M:

    ¬¬x = x        x ⊕ ¬0 = ¬0        ¬(¬x ⊕ y) ⊕ y = ¬(¬y ⊕ x) ⊕ x.

We denote 1 = ¬0, and define multiplication as x ⊗ y = ¬(¬x ⊕ ¬y) and subtraction as x ⊖ y = x ⊗ ¬y. MV-algebras are endowed with a natural order.
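To make Definition 2.1 concrete, here is a small sketch (ours) of the operations on the unit interval, which is the prototypical MV-algebra discussed formally in Example 2.3 below.

```python
# The MV-algebra on [0,1]: truncated addition, complement, and the
# derived operations from Definition 2.1.
def oplus(x, y):                 # x ⊕ y
    return min(x + y, 1.0)

def neg(x):                      # complement ¬x
    return 1.0 - x

def otimes(x, y):                # x ⊗ y = ¬(¬x ⊕ ¬y) = max(x + y - 1, 0)
    return neg(oplus(neg(x), neg(y)))

def ominus(x, y):                # x ⊖ y = x ⊗ ¬y = max(x - y, 0)
    return otimes(x, neg(y))

def join(x, y):                  # x ⊔ y = (x ⊖ y) ⊕ y (cf. Definition 2.2)
    return oplus(ominus(x, y), y)

print(oplus(0.7, 0.5), ominus(0.3, 0.5), join(0.25, 0.75))  # → 1.0 0.0 0.75
```

The join ⊔ computed from ⊖ and ⊕ coincides with max, as expected for the natural order ≤ on the reals.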
Definition 2.2 (natural order).
Let M = (M, ⊕, 0, ¬(·)) be an MV-algebra. The natural order on M is defined, for x, y ∈ M, by x ⊑ y if x ⊕ z = y for some z ∈ M. When ⊑ is total, M is called an MV-chain.

The natural order gives an MV-algebra a lattice structure where ⊥ = 0, ⊤ = 1, x ⊔ y = (x ⊖ y) ⊕ y and x ⊓ y = ¬(¬x ⊔ ¬y) = x ⊗ (¬x ⊕ y). We call the MV-algebra complete if it is a complete lattice, which is not true in general, e.g., for ([0, 1] ∩ ℚ, ≤).
A prototypical example of an MV-algebra is ([0, 1], ⊕, 0, ¬(·)), where x ⊕ y = min{x + y, 1} and ¬x = 1 − x for x, y ∈ [0, 1]. This means that x ⊗ y = max{x + y − 1, 0} and x ⊖ y = max{0, x − y} (truncated subtraction). The operators ⊕ and ⊗ are also known as strong disjunction and conjunction in Łukasiewicz logic [21]. The natural order is ≤ (less or equal) on the reals.

Another example is ({0, . . . , k}, ⊕, 0, ¬(·)) where n ⊕ m = min{n + m, k} and ¬n = k − n for n, m ∈ {0, . . . , k}. We are in particular interested in the case k = 1. Both MV-algebras are complete and are MV-chains. Boolean algebras (with disjunction and complement) also form MV-algebras that are complete, but in general not MV-chains.

MV-algebras are the algebraic semantics of Łukasiewicz logic. They can be shown to correspond to intervals of the form [0, u] in suitable groups, namely abelian lattice-ordered groups with a strong unit u [20].

3 Non-expansive Functions and Their Approximations

As mentioned in the introduction, our interest is in fixpoints of monotone functions f : M^Y → M^Y, where M is an MV-chain and Y is a finite set. We will see that for non-expansive functions we can over-approximate the set of points in which a given a ∈ M^Y can be increased in a way that is preserved by the application of f. This will be the core of the proof rules outlined earlier.

Non-expansive functions on MV-algebras.
For defining non-expansiveness it is convenient to introduce a norm.
Definition 3.1 (norm).
Let M be an MV-chain and let Y be a finite set. Given a ∈ M^Y, we define its norm as ‖a‖ = max{a(y) | y ∈ Y}.

Given a finite set Y, we extend ⊕ and ⊗ to M^Y pointwise, e.g. for a, b ∈ M^Y we write a ⊕ b for the function defined by (a ⊕ b)(y) = a(y) ⊕ b(y) for all y ∈ Y. Given Y′ ⊆ Y and δ ∈ M, we write δ_{Y′} for the function defined by δ_{Y′}(y) = δ if y ∈ Y′ and δ_{Y′}(y) = 0 otherwise. Whenever this does not generate confusion, we write δ instead of δ_Y. It can be seen that ‖·‖ has the properties of a norm, i.e., for all a, b ∈ M^Y and δ ∈ M it holds that (1) ‖a ⊕ b‖ ⊑ ‖a‖ ⊕ ‖b‖, (2) ‖δ ⊗ a‖ = δ ⊗ ‖a‖ and (3) ‖a‖ = 0 implies that a is the constant 0 (see Lem. B.1 in the appendix). Moreover, it is clearly monotone: if a ⊑ b then ‖a‖ ⊑ ‖b‖.

We next introduce non-expansiveness. Although we are ultimately interested in endo-functions f : M^Y → M^Y, in order to allow for compositional reasoning we work with functions whose domain and codomain may differ.

Definition 3.2 (non-expansiveness). Let f : M^Y → M^Z be a function, where M is an MV-chain and Y, Z are finite sets. We say that f is non-expansive if for all a, b ∈ M^Y it holds that ‖f(b) ⊖ f(a)‖ ⊑ ‖b ⊖ a‖.

Note that (a, b) ↦ ‖a ⊖ b‖ is the supremum lifting of a directed version of Chang’s distance [20]. It is easy to see that all non-expansive functions on MV-chains are monotone, and that for M = {0, 1} the two notions coincide.

Approximating the propagation of increases.
Let f : M^Y → M^Z be a monotone function and take a, b ∈ M^Y with a ⊑ b. We are interested in the difference b(y) ⊖ a(y) for y ∈ Y and in how the application of f “propagates” this increase. The reason is that understanding when no increase can be propagated will be crucial for establishing when a fixpoint of a non-expansive function f is actually the greatest one and, more generally, when a (pre-)fixpoint of f is above the greatest fixpoint.

In order to formalise this intuition, we rely on tools from abstract interpretation. In particular, the following pair of functions, which, under a suitable condition, form a Galois connection, will play a major role. The left adjoint α_{a,δ} takes as input a set Y′ and, for y ∈ Y′, increases the value a(y) by δ, while the right adjoint γ_{a,δ} takes as input a function b ∈ [a, a ⊕ δ] and checks for which parameters y ∈ Y the value b(y) exceeds a(y) by at least δ. We also define [Y]_a, the subset of elements y ∈ Y where a(y) is not 1 and thus there is a potential to increase, and δ_a, which gives us the minimal such increase.

Definition 3.3 (functions to sets, and vice versa).
Let M be an MV-algebra and let Y be a finite set. For a ∈ M^Y, define the set [Y]_a = {y ∈ Y | a(y) ≠ 1} and δ_a = min{¬a(y) | y ∈ [Y]_a}, with min ∅ = 1. For 0 ⊏ δ ∈ M we consider the functions α_{a,δ} : 2^{[Y]_a} → [a, a ⊕ δ] and γ_{a,δ} : [a, a ⊕ δ] → 2^{[Y]_a}, defined, for Y′ ⊆ [Y]_a and b ∈ [a, a ⊕ δ], by

    α_{a,δ}(Y′) = a ⊕ δ_{Y′}        γ_{a,δ}(b) = {y ∈ [Y]_a | b(y) ⊖ a(y) ⊒ δ}.

When δ is sufficiently small, the pair ⟨α_{a,δ}, γ_{a,δ}⟩ is a Galois connection.

Lemma 3.4 (Galois connection).
Let M be an MV-algebra and Y be a finite set. For 0 ⊏ δ ⊑ δ_a, the pair ⟨α_{a,δ}, γ_{a,δ}⟩ : 2^{[Y]_a} → [a, a ⊕ δ] is a Galois connection.

Whenever f is non-expansive, it is easy to see that it restricts to a function f : [a, a ⊕ δ] → [f(a), f(a) ⊕ δ] for all δ ∈ M. As mentioned before, a crucial result shows that for all non-expansive functions, under the assumption that Y, Z are finite and the order on M is total, we can suitably approximate the propagation of increases. A useful tool for stating this result is the following notion of approximation of a function.

Definition 3.5 ((δ, a)-approximation). Let M be an MV-chain, let Y, Z be finite sets and let f : M^Y → M^Z be a non-expansive function. For a ∈ M^Y and any δ ∈ M we define f_{a,δ} : 2^{[Y]_a} → 2^{[Z]_{f(a)}} as f_{a,δ} = γ_{f(a),δ} ∘ f ∘ α_{a,δ}.

Given Y′ ⊆ [Y]_a, the image f_{a,δ}(Y′) ⊆ [Z]_{f(a)} is the set of points z ∈ [Z]_{f(a)} such that δ ⊑ f(a ⊕ δ_{Y′})(z) ⊖ f(a)(z), i.e., the points to which f propagates an increase of the function a by δ on the subset Y′.

We first show that f_{a,δ} is antitone in the parameter δ, a non-trivial result.

Lemma 3.6 (anti-monotonicity).
Let M be an MV-chain, let Y, Z be finite sets, let f : M^Y → M^Z be a non-expansive function and let a ∈ M^Y. For θ, δ ∈ M, if θ ⊑ δ then f_{a,δ} ⊆ f_{a,θ} (pointwise).

Since f_{a,δ} increases when δ decreases and there are only finitely many such functions, there must be a value ι_{f,a} such that all functions f_{a,δ} for 0 ⊏ δ ⊑ ι_{f,a} are equal. This function is denoted by f_a and is called the a-approximation of f. We next show that, indeed, for all non-expansive functions the a-approximation properly approximates the propagation of increases.

Theorem 3.7 (approximation of non-expansive functions).
Let M be a complete MV-chain, let Y, Z be finite sets and let f : M^Y → M^Z be a non-expansive function. Then there exists ι_{f,a} ∈ M, the largest value below or equal to δ_a, such that f_{a,δ} = f_{a,δ′} for all 0 ⊏ δ, δ′ ⊑ ι_{f,a}. We denote this function by f_a and call it the a-approximation of f. Then for all 0 ⊏ δ ∈ M:

(a) γ_{f(a),δ} ∘ f ⊆ f_a ∘ γ_{a,δ} on [a, a ⊕ δ];
(b) for δ ⊑ δ_a: δ ⊑ ι_{f,a} iff γ_{f(a),δ} ∘ f = f_a ∘ γ_{a,δ}.

Note that if Y = Z and a is a fixpoint of f, i.e., a = f(a), condition (a) above corresponds exactly to soundness in the sense of abstract interpretation [12], while condition (b) corresponds to (γ-)completeness.

4 Proof Rules

In this section we formalise the proof technique outlined in the introduction for showing that a fixpoint is the greatest one and, more generally, for checking over-approximations of greatest fixpoints of non-expansive functions.

Consider a monotone function f : M^Y → M^Y for some finite set Y. We first focus on the problem of establishing whether some given fixpoint a of f coincides with νf (without explicitly knowing νf) and, in case it does not, of finding an “improvement”, i.e., a post-fixpoint of f larger than a. Observe that when a is a fixpoint, [Y]_a = [Y]_{f(a)}, and thus the a-approximation of f (Thm. 3.7) is an endofunction f_a : 2^{[Y]_a} → 2^{[Y]_a}. We have the following result, which relies on the fact that, due to Thm. 3.7, γ_{a,δ} preserves fixpoints (of f and f_a).

Theorem 4.1 (soundness and completeness for fixpoints).
Let M be a complete MV-chain, Y a finite set and f : M^Y → M^Y a non-expansive function. Let a ∈ M^Y be a fixpoint of f. Then νf_a = ∅ if and only if a = νf.

If a is a fixpoint, but not yet the greatest fixpoint of f, we can increase it and obtain a post-fixpoint.

Lemma 4.2.
Let M be a complete MV-chain, f : M^Y → M^Y a non-expansive function, a ∈ M^Y a fixpoint of f, and let f_a be the corresponding a-approximation and ι_{f,a} as in Thm. 3.7. Then α_{a,ι_{f,a}}(νf_a) = a ⊕ (ι_{f,a})_{νf_a} is a post-fixpoint of f.

Using these results one can perform an alternative fixpoint iteration that approaches the greatest fixpoint from below: start with a post-fixpoint a ⊑ f(a) (which is clearly below νf) and obtain, by (possibly transfinite) iteration, an ascending chain that converges to some fixpoint a, the least fixpoint above the starting point. Now check, via Thm. 4.1, whether Y′ = νf_a = ∅. If yes, we have reached νf = a. If not, α_{a,ι_{f,a}}(Y′) = a ⊕ (ι_{f,a})_{Y′} is again a post-fixpoint (cf. Lem. 4.2), and we continue this procedure until, for some ordinal, we reach the greatest fixpoint νf, for which νf_{νf} = ∅ holds.

Interestingly, the soundness direction of Thm. 4.1 can be generalised to the case in which a is a pre-fixpoint instead of a fixpoint. In this case, the a-approximation of a function f : M^Y → M^Y is a function f_a : 2^{[Y]_a} → 2^{[Y]_{f(a)}} whose domain and codomain differ, hence it would not be meaningful to look for its fixpoints. However, as explained below, it can be restricted to an endofunction.

Theorem 4.3 (soundness for pre-fixpoints).
Let M be a complete MV-chain, Y a finite set and f : M^Y → M^Y a non-expansive function. Given a pre-fixpoint a ∈ M^Y of f, let [Y]_{a=f(a)} = {y ∈ [Y]_a | a(y) = f(a)(y)}. Define f*_a : 2^{[Y]_{a=f(a)}} → 2^{[Y]_{a=f(a)}} by f*_a(Y′) = f_a(Y′) ∩ [Y]_{a=f(a)}, where f_a : 2^{[Y]_a} → 2^{[Y]_{f(a)}} is the a-approximation of f. If νf*_a = ∅ then νf ⊑ a.

Roughly, the intuition for the above result is the following: the value of f(a) on some y might or might not depend “circularly” on the value of a on y itself. In a purely inductive setting, without such circular dependencies, µf = νf, and hence a being a pre-fixpoint means that we over-approximate νf. However, we might have vicious cycles, as explained in the introduction, that destroy the over-approximation because the values are too low. Now, since we restrict to non-expansive functions, it must be the case that there is a cycle all of whose elements are points where a and f(a) coincide. It is hence sufficient to check whether a given pre-fixpoint could be increased on the subpart on which it is a fixpoint, i.e., the idea is to restrict to [Y]_{a=f(a)}. We detect such situations by looking for “wiggle room” as for fixpoints.

Completeness does not generalise to pre-fixpoints, i.e., it is not true that if a is a pre-fixpoint of f and νf ⊑ a then νf*_a = ∅. A pre-fixpoint might contain slack even though it is above the greatest fixpoint. A counterexample is given in Ex. 6.11.

The dual view for least fixpoints.
The theory developed so far can easily be dualised to check under-approximations of least fixpoints. Given a complete MV-algebra M = (M, ⊕, 0, ¬(·)) and a monotone function f : M^Y → M^Y, in order to show that a post-fixpoint a ∈ M^Y satisfies a ⊑ µf, we can in fact simply work in the dual MV-algebra M^op = (M, ⊗, 1, ¬(·)), whose natural order is the reverse (⊒) of the original order.

We next outline the dualised setting (for details on how it arises see Appendix C.1). The notation for the dual case is obtained from that of the original (primal) case by exchanging subscripts and superscripts.

Given a ∈ M^Y, define [Y]^a = {y ∈ Y | a(y) ≠ 0} and δ^a = min{a(y) | y ∈ [Y]^a}. For 0 ⊏ θ ∈ M, we consider the pair of functions ⟨α^{a,θ}, γ^{a,θ}⟩ : 2^{[Y]^a} → [a ⊖ θ, a] where, for Y′ ⊆ [Y]^a, we let α^{a,θ}(Y′) = a ⊖ θ_{Y′} and, for b ∈ [a ⊖ θ, a], we let γ^{a,θ}(b) = {y ∈ [Y]^a | a(y) ⊖ b(y) ⊒ θ}.

A function f : M^Y → M^Z is non-expansive in the dual MV-algebra exactly when it is non-expansive in the primal one. Its approximation in the sense of Thm. 3.7 is denoted f^a. The dualisations of Thm. 4.1 and 4.3 then hold: if a is a fixpoint of f, then νf^a = ∅ iff µf = a, and whenever a is a post-fixpoint, ν(f^a)* = ∅ implies a ⊑ µf.

5 Composing Functions and Approximations

Given a non-expansive function f and a (pre/post-)fixpoint a, it is often non-trivial to determine the corresponding approximations. However, non-expansive functions enjoy good closure properties (closure under composition and under disjoint union), and we will see that the same holds for the corresponding approximations. Furthermore, it turns out that the functions needed in the applications can be obtained from just a few templates. This gives us a toolbox for assembling approximations with relative ease.

Theorem 5.1.
All basic functions listed in Table 1 are non-expansive. Furthermore, non-expansive functions are closed under composition and disjoint union. The approximations are the ones listed in the third column of the table.
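Before the formal treatment in the next section, the “wiggle room” computation sketched in the introduction can be prototyped directly. The transition data below is our reconstruction of the chain of Fig. 1 (x branches to u, y, z with probability 1/3 each; y and z point to each other; u is terminal), and t is the spurious fixpoint discussed there.

```python
# Wiggle-room check for the Markov chain of Fig. 1 (transition data is
# our reconstruction of the figure).
succ = {"x": {"u": 1/3, "y": 1/3, "z": 1/3}, "y": {"z": 1.0}, "z": {"y": 1.0}}
terminal = {"u"}
states = ["x", "y", "z", "u"]

# the "wrong" fixpoint t from the introduction: everything terminates
t = {"x": 1.0, "y": 1.0, "z": 1.0, "u": 1.0}

carrier = {s for s in states if t[s] > 0}        # [S]^t: reduction possible

def T_t(S1):
    """Dual approximation T^t: a state keeps its wiggle room iff it is
    non-terminal and all its successors still have wiggle room."""
    return {s for s in carrier if s not in terminal
            and all(s2 in S1 for s2 in succ[s])}

S1 = carrier                    # greatest fixpoint by iteration from the top
while T_t(S1) != S1:
    S1 = T_t(S1)
print(sorted(S1))               # → ['y', 'z']
```

A nonempty greatest fixpoint certifies t ≠ µT; had it been empty, t = µT would follow by the dual proof rule.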
6 Applications

6.1 Termination probability

We start by making the example from the introduction (§ 1) more formal. Consider a Markov chain (S, T, η), as defined in the introduction (Fig. 1), where we restrict the codomain of η : S \ T → D(S) to D ⊆ D(S), where D is finite (to ensure that all involved sets are finite). Furthermore, let T : [0, 1]^S → [0, 1]^S be the function from the introduction, whose least fixpoint µT assigns to each state its termination probability.

Lemma 6.1.
The function T can be written as T = (η* ∘ av_D) ⊎ c_k, where k : T → [0, 1] is the constant function 1 defined only on the terminal states.

From this representation and Thm. 5.1 it is obvious that T is non-expansive.

Table 1: Basic functions f : M^Y → M^Z (constant, reindexing, minimum, maximum, average), function composition and disjoint union, with the corresponding approximations f_a : 2^{[Y]_a} → 2^{[Z]_{f(a)}} (primal, first line) and f^a : 2^{[Y]^a} → 2^{[Z]^{f(a)}} (dual, second line). Notation: R⁻(z) = {y ∈ Y | y R z}; supp(p) = {y ∈ Y | p(y) > 0} for p ∈ D(Y); Min_{a|S} (resp. Max_{a|S}) is the set of elements of S on which a : Y → M is minimal (resp. maximal).

    c_k (k ∈ M^Z):  f(a) = k
        f_a(Y′) = ∅
        f^a(Y′) = ∅
    u* (u : Z → Y):  f(a) = a ∘ u
        f_a(Y′) = u⁻¹(Y′)
        f^a(Y′) = u⁻¹(Y′)
    min_R (R ⊆ Y × Z):  f(a)(z) = min_{y R z} a(y)
        f_a(Y′) = {z ∈ [Z]_{f(a)} | Min_{a|R⁻(z)} ⊆ Y′}
        f^a(Y′) = {z ∈ [Z]^{f(a)} | Min_{a|R⁻(z)} ∩ Y′ ≠ ∅}
    max_R (R ⊆ Y × Z):  f(a)(z) = max_{y R z} a(y)
        f_a(Y′) = {z ∈ [Z]_{f(a)} | Max_{a|R⁻(z)} ∩ Y′ ≠ ∅}
        f^a(Y′) = {z ∈ [Z]^{f(a)} | Max_{a|R⁻(z)} ⊆ Y′}
    av_D (M = [0, 1], Z = D ⊆ D(Y)):  f(a)(p) = Σ_{y∈Y} p(y) · a(y)
        f_a(Y′) = {p ∈ [D]_{f(a)} | supp(p) ⊆ Y′}
        f^a(Y′) = {p ∈ [D]^{f(a)} | supp(p) ⊆ Y′}
    h ∘ g (g : M^Y → M^W, h : M^W → M^Z):  f(a) = h(g(a))
        f_a(Y′) = h_{g(a)}(g_a(Y′))
        f^a(Y′) = h^{g(a)}(g^a(Y′))
    ⊎_{i∈I} f_i, I finite (f_i : M^{Y_i} → M^{Z_i}, Y = ⊎_{i∈I} Y_i, Z = ⊎_{i∈I} Z_i):  f(a)(z) = f_i(a|_{Y_i})(z) for z ∈ Z_i
        f_a(Y′) = ⊎_{i∈I} (f_i)_{a|_{Y_i}}(Y′ ∩ Y_i)
        f^a(Y′) = ⊎_{i∈I} (f_i)^{a|_{Y_i}}(Y′ ∩ Y_i)

Lemma 6.2.
Let t : S → [0, 1]. The approximation for T in the dual sense is T^t : 2^{[S]^t} → 2^{[S]^{T(t)}} with T^t(S′) = {s ∈ [S]^{T(t)} | s ∉ T ∧ supp(η(s)) ⊆ S′}.

It is well known that the function T can be tweaked in such a way that it has a unique fixpoint, coinciding with µT, by determining all states which cannot reach a terminal state and setting their value to zero [3]. Hence fixpoint iteration from above does not bring us any added value here. It does, however, make sense to use the proof rule in order to guarantee lower bounds via post-fixpoints. Furthermore, termination probability is a special case of the considerably more complex stochastic games that will be studied in § 7, where the trick of modifying the function is not applicable.

6.2 Behavioural metrics for probabilistic automata
Before we start discussing probabilistic automata, we first consider the Hausdorff and the Kantorovich liftings and the corresponding approximations.
Hausdorff lifting.
Given a metric on a set X, the Hausdorff metric is obtained by lifting the original metric to 2^X. Here we define this lifting for general distance functions with values in M, not restricting to metrics. In particular, the Hausdorff lifting is given by a function H : M^{X×X} → M^{2^X × 2^X} where

    H(d)(X₁, X₂) = max{ max_{x₁∈X₁} min_{x₂∈X₂} d(x₁, x₂), max_{x₂∈X₂} min_{x₁∈X₁} d(x₁, x₂) }.

An alternative characterisation, due to Mémoli [19] and also used in [4], is more convenient for our purposes. Let u : 2^{X×X} → 2^X × 2^X with u(C) = (π₁[C], π₂[C]), where π₁, π₂ are the projections π_i : X × X → X and π_i[C] = {π_i(c) | c ∈ C}. Then H(d)(X₁, X₂) = min{ max_{(x₁,x₂)∈C} d(x₁, x₂) | C ⊆ X₁ × X₂ ∧ u(C) = (X₁, X₂) }. Relying on this, we can obtain the result below, from which we deduce that H is non-expansive, and construct its approximation as the composition of the corresponding functions from Table 1.

Lemma 6.3. H = min_u ∘ max_∈, where max_∈ : M^{X×X} → M^{2^{X×X}} (here ∈ ⊆ (X × X) × 2^{X×X} is the “is-element-of” relation on X × X) and min_u : M^{2^{X×X}} → M^{2^X × 2^X}.

Kantorovich lifting. The Kantorovich (also known as Wasserstein) lifting converts a distance on X into a distance on probability distributions over X. As for the Hausdorff lifting, we lift distance functions that are not necessarily metrics. Furthermore, in order to ensure finiteness of all the sets involved, we restrict to D ⊆ D(X), some finite set of probability distributions over X. A coupling of p, q ∈ D is a probability distribution c ∈ D(X × X) whose left and right marginals are p and q, i.e., p(x₁) = m^L_c(x₁) := Σ_{x₂∈X} c(x₁, x₂) and q(x₂) = m^R_c(x₂) := Σ_{x₁∈X} c(x₁, x₂). The set of all couplings of p and q, denoted by Ω(p, q), forms a polytope with finitely many vertices [23].
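Both notions just introduced are straightforward to prototype. The sketch below (example data ours) computes H(d) for nonempty finite sets directly from the definition and checks the marginal conditions of a concrete coupling.

```python
# Hausdorff lifting H(d)(X1, X2) for nonempty finite sets, plus a
# marginal check for a coupling c of two distributions p and q.
def hausdorff(d, X1, X2):
    to_2 = max(min(d[(x1, x2)] for x2 in X2) for x1 in X1)
    to_1 = max(min(d[(x1, x2)] for x1 in X1) for x2 in X2)
    return max(to_2, to_1)

pts = ["a", "b", "c"]
d = {(x, y): 0.0 if x == y else 1.0 for x in pts for y in pts}  # discrete distance
print(hausdorff(d, {"a", "b"}, {"a", "b"}))   # → 0.0
print(hausdorff(d, {"a", "b"}, {"a", "c"}))   # → 1.0

p = {"a": 0.5, "b": 0.5}
q = {"a": 0.25, "b": 0.75}
c = {("a", "a"): 0.25, ("a", "b"): 0.25, ("b", "b"): 0.5}       # c ∈ Ω(p, q)
assert all(abs(sum(c.get((x, y), 0) for y in q) - p[x]) < 1e-12 for x in p)
assert all(abs(sum(c.get((x, y), 0) for x in p) - q[y]) < 1e-12 for y in q)
```

The Kantorovich value K(d)(p, q), defined next, is the least expected distance Σ c(x₁, x₂) · d(x₁, x₂) over all such couplings, a finite linear programme by the polytope property cited above.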
The set of all polytope vertices that are obtained by coupling any p, q ∈ D is also finite and is denoted by VP_D ⊆ D(X × X). The Kantorovich lifting is given by K : [0, 1]^{X×X} → [0, 1]^{D×D} where

    K(d)(p, q) = min_{c∈Ω(p,q)} Σ_{(x₁,x₂)∈X×X} c(x₁, x₂) · d(x₁, x₂).

The coupling c can be interpreted as the optimal transport plan for moving goods from suppliers to customers [29]. Again there is an alternative characterisation, which shows non-expansiveness of K:

Lemma 6.4.
Let u : VP_D → D × D with u(c) = (m^L_c, m^R_c). Then K = min_u ∘ av_{VP_D}, where av_{VP_D} : [0, 1]^{X×X} → [0, 1]^{VP_D} and min_u : [0, 1]^{VP_D} → [0, 1]^{D×D}.

Probabilistic automata. We now compare our approach with that of [2], which describes the first method for computing behavioural distances for probabilistic automata. Although the behavioural distance arises as a least fixpoint, it is in fact better (indeed, it is the only known method) to iterate from above in order to reach this least fixpoint. This is done by guessing and improving couplings, similar to the strategy iteration discussed later in § 7. A major complication, faced in [2], is that the procedure can get stuck at a fixpoint which is not the least one; one then has to detect that this is the case and decrease the current candidate. In fact, this paper was our inspiration for generalising this technique to a more general setting.

A probabilistic automaton is a tuple A = (S, L, η, ℓ), where S is a non-empty finite set of states, L is a finite set of labels, η : S → 2^{D(S)} assigns a finite set of probability distributions to each state and ℓ : S → L is a labelling function. (In the following we again replace D(S) by a finite subset D.)

The probabilistic bisimilarity pseudometric is the least fixpoint of the function M : [0, 1]^{S×S} → [0, 1]^{S×S} where, for d : S × S → [0, 1] and s, t ∈ S,

    M(d)(s, t) = 1                            if ℓ(s) ≠ ℓ(t)
    M(d)(s, t) = H(K(d))(η(s), η(t))          otherwise,

where H is the Hausdorff lifting (for M = [0, 1]) and K is the Kantorovich lifting defined earlier. Now assume that d is a fixpoint of M, i.e., d = M(d). In order to check whether d = µM, [2] adapts the notion of a self-closed relation from [15].

Definition 6.5 ([2]).
A relation M ⊆ S × S is self-closed wrt. d = M(d) if, whenever s M t, then

– ℓ(s) = ℓ(t) and d(s, t) > 0;
– if p ∈ η(s) and d(s, t) = min_{q′∈η(t)} K(d)(p, q′), then there exist q ∈ η(t) and c ∈ Ω(p, q) such that d(s, t) = Σ_{u,v∈S} d(u, v) · c(u, v) and supp(c) ⊆ M;
– if q ∈ η(t) and d(s, t) = min_{p′∈η(s)} K(d)(p′, q), then there exist p ∈ η(s) and c ∈ Ω(p, q) such that d(s, t) = Σ_{u,v∈S} d(u, v) · c(u, v) and supp(c) ⊆ M.

The largest self-closed relation, denoted by ≈_d, is empty if and only if d = µM [2].

We now investigate the relation between self-closed relations and post-fixpoints of approximations. To this end, we first show that M can be composed from non-expansive functions, which proves that it is indeed non-expansive. Furthermore, this decomposition will help in the comparison.

Lemma 6.6.
The fixpoint function M characterising the probabilistic bisimilarity pseudometric can be written as M = max_ρ ∘ (((η × η)* ∘ H ∘ K) ⊎ c_l), where ρ : (S × S) ⊎ (S × S) → (S × S) with ρ((s, t), i) = (s, t). Furthermore, l : S × S → [0, 1] is defined by l(s, t) = 0 if ℓ(s) = ℓ(t) and l(s, t) = 1 if ℓ(s) ≠ ℓ(t). Here we use i ∈ {1, 2} as indices to distinguish the elements of the disjoint union.

Hence M is a composition of non-expansive functions and is thus non-expansive itself. We do not spell out M^d explicitly, but instead show how it is related to self-closed relations.

Proposition 6.7.
Let d : S×S → [0,1] with d = M(d). Then M^d : P([S×S]_d) → P([S×S]_d), where [S×S]_d = {(s,t) ∈ S×S | d(s,t) > 0}. Moreover, M is a self-closed relation wrt. d if and only if M ⊆ [S×S]_d and M is a post-fixpoint of M^d.

In order to define standard bisimilarity we use a variant G of the Hausdorff lifting H introduced earlier. Now we can define the fixpoint function for bisimilarity and its corresponding approximation. For simplicity we consider unlabelled transition systems, but it would be straightforward to handle labelled transitions.

Let X be a finite set of states and η : X → P(X) a function that assigns a set of successors η(x) to each state x ∈ X. For the fixpoint function for bisimilarity B : {0,1}^{X×X} → {0,1}^{X×X} we use the Hausdorff lifting G with M = {0,1}. Lemma 6.8.
Bisimilarity on η is the greatest fixpoint of B = (η × η)* ∘ G.

Since we are interested in the greatest fixpoint, we are working in the primal sense. Bisimulation relations are represented by their characteristic functions d : X×X → {0,1}; in fact, the corresponding relation can be obtained by taking the complement of [X×X]_d = {(x1, x2) ∈ X×X | d(x1, x2) = 0}. Lemma 6.9.
Let d : X×X → {0,1}. The approximation for the bisimilarity function B in the primal sense is B^d : P([X×X]_d) → P([X×X]_{B(d)}) with

B^d(R) = {(x1, x2) ∈ [X×X]_{B(d)} |
  (∀ y1 ∈ η(x1) ∃ y2 ∈ η(x2). (y1, y2) ∉ [X×X]_d ∨ (y1, y2) ∈ R) ∧
  (∀ y2 ∈ η(x2) ∃ y1 ∈ η(x1). (y1, y2) ∉ [X×X]_d ∨ (y1, y2) ∈ R)}

We conclude this section by discussing how this view on bisimilarity can be useful: first, it again opens up the possibility to compute bisimilarity – a greatest fixpoint – by iterating from below, through smaller fixpoints. This could potentially be useful if it is easy to compute the least fixpoint of B inductively and to continue from there.

Furthermore, we obtain a technique for witnessing non-bisimilarity of states. While this can also be done by exhibiting a distinguishing modal formula [16,8] or by a winning strategy for the spoiler in the bisimulation game [26], to our knowledge there is no known method that does this directly, based on the definition of bisimilarity.

With our technique, however, we can witness non-bisimilarity of two states x1, x2 ∈ X by presenting a pre-fixpoint d (i.e., B(d) ≤ d) such that d(x1, x2) = 0 (equivalently, (x1, x2) ∈ [X×X]_d) and νB^d = ∅, since this implies νB(x1, x2) ≤ d(x1, x2) = 0 by our proof rule.

There are two issues to discuss. First, how can we characterise a pre-fixpoint of B (which is quite unusual, since bisimulations are post-fixpoints)? In fact, the condition B(d) ≤ d can be rewritten to: for all (x1, x2) ∈ [X×X]_d there exists y1 ∈ η(x1) such that for all y2 ∈ η(x2) we have (y1, y2) ∈ [X×X]_d (or vice versa). Second, at first sight it does not seem as if we gained anything, since we still have to do a fixpoint computation on relations. However, the carrier set is [X×X]_d, i.e., a set of non-bisimilarity witnesses, and this set can be small even though X might be large. Example 6.10.
We consider the transition system with states X = {x, y, u} and transitions x → y and u → u (so y has no successors). Our aim is to construct a witness showing that x, u are not bisimilar. This witness is a function d : X×X → {0,1} with d(x,u) = 0 = d(y,u), while all other pairs have value 1.

Hence [X×X]_d = [X×X]_{B(d)} = {(x,u), (y,u)} and it is easy to check that d is a pre-fixpoint of B and that νB^d = ∅: we iterate over {(x,u), (y,u)} and first remove (y,u) (since y has no successors) and then (x,u). This implies that νB ≤ d and hence νB(x,u) = 0, which means that x, u are not bisimilar. Example 6.11.
We modify Ex. 6.10 and consider a function d where d(x,u) = 0 and all other values are 1. Again d is a pre-fixpoint of B and νB ≤ d (since only reflexive pairs are in the bisimilarity). However, νB^d ≠ ∅, since {(x,u)} is a post-fixpoint. This is a counterexample to completeness, as discussed after Thm. 4.3. Intuitively speaking, the states y, u over-approximate and claim that they are bisimilar, although they are not. (This is permissible for a pre-fixpoint.) This tricks x, u into thinking that there is some wiggle room and that one can increase the value of (x,u). This is true, but only because of the limited, local view, since the “true” value of (y,u) is 0. Introduction to simple stochastic games.
In this section we show how our techniques can be applied to simple stochastic games [10,9]. A simple stochastic game is a state-based two-player game where the two players, Min and Max, each own a subset of the states they control, for which they can choose the successor. The system also contains sink states with an assigned payoff and averaging states which choose their successor randomly according to a given probability distribution. The goal of Min is to minimise and the goal of Max to maximise the payoff.

Simple stochastic games are an important type of games that subsume parity games and the computation of behavioural distances for probabilistic automata (cf. the previous section). Solving them is known to lie in NP ∩ coNP, but it is an open question whether the problem is contained in P. There are known randomised subexponential algorithms [6].

It has been shown that it is sufficient to consider positional strategies, i.e., strategies where the choice of the player depends only on the current state. The expected payoffs for each state form a so-called value vector and can be obtained as the least solution of a fixpoint equation (see below).

A simple stochastic game is given by a finite set V of nodes, partitioned into MIN, MAX, AV (average) and SINK, together with the following data: η_min : MIN → P(V) and η_max : MAX → P(V) (successor functions for Min and Max nodes), η_av : AV → D (probability distributions, where D ⊆ D(V) is finite) and w : SINK → [0,1] (payoffs). The fixpoint function V : [0,1]^V → [0,1]^V is defined, for a : V → [0,1] and v ∈ V, by:

V(a)(v) = min_{v'∈η_min(v)} a(v')        if v ∈ MIN
V(a)(v) = max_{v'∈η_max(v)} a(v')        if v ∈ MAX
V(a)(v) = ∑_{v'∈V} η_av(v)(v') · a(v')   if v ∈ AV
V(a)(v) = w(v)                           if v ∈ SINK
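The case distinction defining V can be transcribed directly. Below is a minimal sketch in Python; the dictionary encoding, the helper name `V_step` and the toy game (one node of each kind, two sinks) are our own illustration, not taken from the paper. Since V is monotone and we are after its least fixpoint, Kleene iteration from the constant-0 vector approximates it from below.

```python
# A toy game (our own encoding, for illustration): one Min node "m",
# one Max node "M", one average node "avg", two sinks "s0", "s1".
MIN_, MAX_, AV, SINK = {"m"}, {"M"}, {"avg"}, {"s0", "s1"}
eta_min = {"m": ["s0", "s1"]}             # successors Min may choose from
eta_max = {"M": ["s0", "s1"]}             # successors Max may choose from
eta_av = {"avg": {"s0": 0.5, "s1": 0.5}}  # probability distribution
w = {"s0": 0.0, "s1": 1.0}                # sink payoffs

def V_step(a):
    """One application of the fixpoint function V to a value vector a."""
    new = {}
    for v in MIN_:
        new[v] = min(a[s] for s in eta_min[v])
    for v in MAX_:
        new[v] = max(a[s] for s in eta_max[v])
    for v in AV:
        new[v] = sum(p * a[s] for s, p in eta_av[v].items())
    for v in SINK:
        new[v] = w[v]
    return new

# Kleene iteration from below: the ascending chain 0, V(0), V(V(0)), ...
a = {v: 0.0 for v in MIN_ | MAX_ | AV | SINK}
for _ in range(10):
    a = V_step(a)
```

Here the chain stabilises after two steps; in general (e.g. with cycles through average nodes) it only converges in the limit, which is one reason to look beyond plain value iteration for exact results.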
The least fixpoint of V specifies the average payoff for all nodes when Min and Max play optimally. In an infinite game the payoff is 0. In order to avoid infinite games and to guarantee uniqueness of the fixpoint, many authors [17,9,28] restrict to stopping games, which are guaranteed to terminate for every pair of Min/Max strategies. Here we deal with general games where more than one fixpoint may exist. Such a scenario has been studied in [18], which considers value iteration to under- and over-approximate the value vector. The over-approximation faces challenges with cyclic dependencies, similar to the vicious cycles described earlier. Here we focus on strategy iteration, which is usually less efficient than value iteration, but yields a precise result instead of approximating it. Example 7.1.
We consider the game with nodes min, max, av, 1 and ε. Here min is a Min node with η_min(min) = {1, av}, max is a Max node with η_max(max) = {ε, av}, 1 is a sink node with payoff 1, ε is a sink node with some small payoff ε ∈ (0,1), and av is an average node which transitions to min and to max, each with probability 1/2.

Min should choose av as successor, since a payoff of 1 is bad for Min. Given this choice of Min, Max should not declare av as successor, since this would create an infinite play and hence the payoff would be 0. Therefore Max has to choose ε and be content with a payoff of ε, which is achieved from all nodes different from 1.

In order to determine the approximation of V and to apply our techniques, we consider the following equivalent definition.

Lemma 7.2. V = (η_min* ∘ min_∈) ⊎ (η_max* ∘ max_∈) ⊎ (η_av* ∘ av_D) ⊎ c_w, where ∈ ⊆ V × P(V) is the “is-element-of” relation on V.
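The informal analysis of Ex. 7.1 can be checked numerically by Kleene iteration from below. This is a sketch under our own encoding: we instantiate the small payoff as ε = 0.1 and write "one" for the sink with payoff 1.

```python
# Value vector for the game of Ex. 7.1, approximated from below.
EPS = 0.1  # a concrete choice for the small payoff epsilon
a = {"min": 0.0, "max": 0.0, "av": 0.0, "one": 1.0, "eps": EPS}
for _ in range(200):
    a = {
        "min": min(a["one"], a["av"]),          # eta_min(min) = {1, av}
        "max": max(a["eps"], a["av"]),          # eta_max(max) = {eps, av}
        "av": 0.5 * a["min"] + 0.5 * a["max"],  # average node
        "one": 1.0,                             # sink with payoff 1
        "eps": EPS,                             # sink with payoff epsilon
    }
```

The iterates converge to the least fixpoint, which assigns ε to min, max and av, matching the analysis above: Max is forced to settle for the ε-sink. Note that, in exact arithmetic, value iteration only converges in the limit here and never reaches the fixpoint in finitely many steps.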
As a composition of non-expansive functions, V is non-expansive as well. Since we are interested in the least fixpoint, we work in the dual sense and obtain the following approximation, which intuitively says: we can decrease the value at a node v by a constant only if, in the case of a Min node, we decrease the value of one successor where the minimum is reached; in the case of a Max node, we decrease the values of all successors where the maximum is reached; and in the case of an average node, we decrease the values of all successors in the support. Lemma 7.3.
Let a : V → [0,1]. The approximation for the value iteration function V in the dual sense is V^a : P([V]_a) → P([V]_{V(a)}) with

V^a(V') = {v ∈ [V]_{V(a)} | (v ∈ MIN ∧ Min_{a|η_min(v)} ∩ V' ≠ ∅) ∨ (v ∈ MAX ∧ Max_{a|η_max(v)} ⊆ V') ∨ (v ∈ AV ∧ supp(η_av(v)) ⊆ V')}

where Min_{a|η_min(v)} (resp. Max_{a|η_max(v)}) is the set of successors in η_min(v) (resp. η_max(v)) on which a attains its minimum (resp. maximum).

Strategy iteration from above and below.