Computing Probabilistic Bisimilarity Distances for Probabilistic Automata
Giorgio Bacci, Giovanni Bacci, Kim G. Larsen, Radu Mardare, Qiyi Tang, Franck van Breugel
GIORGIO BACCI, GIOVANNI BACCI, KIM G. LARSEN, RADU MARDARE, QIYI TANG, AND FRANCK VAN BREUGEL

Dept. of Computer Science, Aalborg University, Denmark. e-mail address: [email protected]
Dept. of Computer Science, Aalborg University, Denmark. e-mail address: [email protected]
Dept. of Computer Science, Aalborg University, Denmark. e-mail address: [email protected]
Dept. of Computer and Information Sciences, University of Strathclyde, Glasgow, UK. e-mail address: [email protected]
Dept. of Computer Science, Oxford University, UK. e-mail address: [email protected]
DisCoVeri Group, Dept. of Electrical Engineering and Computer Science, York University, Canada. e-mail address: [email protected]
Abstract.
The probabilistic bisimilarity distance of Deng et al. has been proposed as a robust quantitative generalization of Segala and Lynch's probabilistic bisimilarity for probabilistic automata. In this paper, we present a characterization of the bisimilarity distance as the solution of a simple stochastic game. The characterization gives us an algorithm to compute the distances by applying Condon's simple policy iteration on these games. The correctness of Condon's approach, however, relies on the assumption that the games are stopping. Our games may be non-stopping in general, yet we are able to prove termination for this extended class of games. Other algorithms have already been proposed in the literature to compute these distances, with complexity in UP ∩ coUP and PPAD. Despite their theoretical relevance, these algorithms are inefficient in practice. To the best of our knowledge, our algorithm is the first practical solution.

The characterization of the probabilistic bisimilarity distance mentioned above crucially uses a dual presentation of the Hausdorff distance due to Mémoli. As an additional contribution, in this paper we show that Mémoli's result can also be used to prove that the bisimilarity distance bounds the difference in the maximal (or minimal) probability of two states satisfying arbitrary ω-regular properties, expressed, e.g., as LTL formulas.

Key words and phrases: probabilistic automata, behavioural pseudometrics, stochastic games.
∗ This paper is an extended version of an earlier conference paper [BBL+19] presented at CONCUR 2019.
Giovanni Bacci and Kim G. Larsen are supported by the ERC-Project LASSO. Franck van Breugel is supported by the Natural Sciences and Engineering Research Council of Canada.
Preprint submitted to Logical Methods in Computer Science.
© G. Bacci, G. Bacci, K. G. Larsen, R. Mardare, Q. Tang, and F. van Breugel; licensed under Creative Commons.

1. Introduction
In [GJS90], Giacalone et al. observed that, for reasoning about the behaviour of probabilistic systems, a notion of distance is more reasonable in practice than an equivalence, since it permits to capture the degree of difference between two states. This observation motivated the study of behavioural pseudometrics, which generalize behavioural equivalences in the sense that two states are behaviourally equivalent when their distance is zero.

The systems we consider in this paper are labelled probabilistic automata. This model was introduced by Segala [Seg95] to capture both nondeterminism (hence, concurrency) and probabilistic behaviours. The labels on states are used to express that certain properties of interest hold in particular states.

In Figure 1 we consider an example of a probabilistic automaton describing two gamblers, f and b, deciding on which team to bet in a football match. Typically the two gamblers know on which team to bet, but occasionally they prefer to toss a coin to make a decision. This is represented by the three probabilistic transitions in the state f. The first two take f to state h (head) or t (tail) with probability one; the last takes f to states h and t with probability 1/2 each. The difference between f and b is that the former uses a fair coin while the latter uses a biased coin landing on heads with slightly higher probability. Once the decision is taken, it is not changed anymore. This is seen on states h and t, which have a single probabilistic transition taking the state to itself with probability one. The states h and t have distinct labels, here represented by colours.

A behavioural pseudometric for probabilistic automata capturing this difference is the probabilistic bisimilarity distance by Deng et al. [DCPP06], introduced as a robust generalization of Segala and Lynch's probabilistic bisimilarity [SL94].
The key ingredients of this pseudometric are the Hausdorff metric [Hau14] and the Kantorovich metric [Kan42], respectively used to capture nondeterministic and probabilistic behaviour. In the example above, the behaviours of the states h and t are very different, since their labels are different. As a result, their probabilistic bisimilarity distance is one. On the other hand, the behaviours of the states f and b are very similar, which is reflected by a small probabilistic bisimilarity distance.

The first attempt to compute the above distance is due to Chen et al. [CHL07], who proposed a doubly exponential-time procedure to approximate the distances up to any degree of accuracy. The complexity was later improved to PSPACE by Chatterjee et al. [CdAMR08, CdAMR10]. Their solutions exploit the decision procedure for the existential
Figure 1: A probabilistic automaton describing two gamblers.
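The automaton of Figure 1 can be written down as plain data; the sketch below checks the conditions required of a probabilistic automaton (totality, and that each successor is a probability distribution). The 0.51 bias of the unfair coin and the colour labels are our own assumptions, as the paper only says the biased coin lands on heads with "slightly higher" probability.

```python
# Sketch of the two-gambler automaton of Figure 1 as plain Python data.
# The 0.51 bias and the colour labels are illustrative assumptions.

labels = {"f": "white", "b": "white", "h": "green", "t": "red"}

transitions = {
    "f": [{"h": 1.0}, {"t": 1.0}, {"h": 0.5, "t": 0.5}],    # fair coin
    "b": [{"h": 1.0}, {"t": 1.0}, {"h": 0.51, "t": 0.49}],  # biased coin (assumed bias)
    "h": [{"h": 1.0}],
    "t": [{"t": 1.0}],
}

def is_probabilistic_automaton(labels, transitions):
    """Check totality and that every successor is a probability distribution."""
    for s in labels:
        if not transitions.get(s):  # totality: at least one outgoing transition
            return False
        for mu in transitions[s]:
            if abs(sum(mu.values()) - 1.0) > 1e-9 or any(p < 0 for p in mu.values()):
                return False
            if any(u not in labels for u in mu):  # support must consist of states
                return False
    return True

print(is_probabilistic_automaton(labels, transitions))  # True
```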
fragment of the first-order theory of the reals. It is worth noting that [CHL07, CdAMR08] consider the pseudometric that does not discount the future (a.k.a. the undiscounted distance), which entails additional algorithmic challenges. Later, Fu [Fu12] showed that the distances have rational values and that computing the discounted distance can be done in polynomial time by using a value-iteration procedure in combination with the continued fraction algorithm [Sch99, Section 6]. As for the undiscounted distance, he showed that the threshold problem, i.e., deciding whether the distance is smaller than a given rational, is in NP ∩ coNP. The same proof can be adapted to show that the decision problem is in UP ∩ coUP [Fu14], where UP is the subclass of NP-problems with a unique accepting computation. Van Breugel and Worrell [vBW14] have later shown that the problem is in PPAD, which is short for polynomial parity argument on directed graphs. Notably, their proof exploits a characterization of the distance as a simple stochastic game. The above algorithms were presented with the purpose of understanding the complexity of computing bisimilarity distances and, to the best of our knowledge, they have never been implemented. Their implementation would involve either an enumeration of possibly exponentially many fixed points [Fu14], or the use of SMT solvers over the existential fragment of the first-order theory of the reals [CdAMR08, CdAMR10]. An earlier attempt at approximating the bisimilarity distance for the more specific case of labelled Markov chains, by expressing the problem in the existential fragment of the first-order theory of the reals, was proposed in [vBSW08]. Its latest implementation using CVC4 [BCD+11] is able to handle chains with 82 states in approximately 66 hours. In this paper, we propose an alternative approach that is inspired by the successful implementations of similar pseudometrics on labelled Markov chains [BBLM13, TvB16, TvB18a].

Our solution is based on a novel characterization of the probabilistic bisimilarity distance as the solution of a simple stochastic game. Stochastic games were introduced by Shapley [Sha53]. A simplified version of these games, called simple stochastic games, was studied by Condon [Con92]. Several algorithms have been proposed to compute the value function of a simple stochastic game, many using policy iteration. Condon [Con90] proposed an algorithm, known as simple policy iteration, that switches only one non-optimal choice per iteration. The correctness of Condon's algorithm, however, relies on the assumption that the game is stopping.

It turns out that the simple stochastic games characterizing the probabilistic bisimilarity distances are stopping only when the distances discount the future. In case the distance is non-discounting, the corresponding games may not be stopping. To recover correctness of the policy iteration procedure, we adapt Condon's simple policy iteration algorithm by adding a non-local update of the strategy of the min player and an extra termination condition based on a notion of "self-closed" relation due to Fu [Fu12]. The practical efficiency of our algorithm has been evaluated on a significant set of randomly generated probabilistic automata. The results show that our algorithm performs better than the corresponding iterative algorithms proposed for the discounted distances in [Fu12], even though the theoretical complexity of our proposal is exponential in the worst case (cf. [TvB16]) whereas Fu's is polynomial. The implementation of the algorithms exploits a coupling structure characterization of the distance that allows us to skip the construction of the simple stochastic game, which may result in an exponential blow-up of the memory required for storing the game. The code is available at bitbucket.org/discoveri/first-order.
The two characterizations of probabilistic bisimilarity distances proposed in this paper (either via simple stochastic games or via coupling structures) crucially use a dual presentation of the Hausdorff distance due to Mémoli [Mém11]. Still using Mémoli's result, as an additional contribution of this paper we show that the (undiscounted) bisimilarity distance can be used to bound the difference of the maximal (or minimal) probability of two states satisfying arbitrary ω-regular specifications, expressed, e.g., as LTL formulas. Notably, this result allows us to relate the probabilistic bisimilarity pseudometric of Deng et al. to probabilistic model checking of probabilistic automata against linear-time specifications.

Synopsis. Section 2 introduces the notation and some preliminary results used in the paper. In Section 3 we recall the definition of the probabilistic bisimilarity distances of Deng et al. for probabilistic automata; then, in Section 4 we propose a characterisation of the probabilistic bisimilarity distances as the values of a simple stochastic game constructed from the automaton, here called the probabilistic bisimilarity game. Towards an algorithmic solution for computing bisimilarity distances, in Section 5 we provide an alternative characterisation of the distances in terms of coupling structures. Section 6 describes a procedure for computing the bisimilarity distances based on Condon's simple policy iteration algorithm. In Section 7 we discuss the relation between the notion of bisimilarity distance and probabilistic model checking of ω-regular linear-time specifications against probabilistic automata. Finally, Section 8 concludes with some remarks and future work directions.

2. Preliminaries and Notation
The set of functions from X to Y is denoted by Y^X. We denote by f[x/y] ∈ Y^X the update of f ∈ Y^X at x ∈ X with y ∈ Y, defined by f[x/y](x′) = y if x′ = x, and f[x/y](x′) = f(x′) otherwise.

A (1-bounded) pseudometric on a set X is a function d: X × X → [0, 1] such that d(x, x) = 0, d(x, y) = d(y, x), and d(x, y) ≤ d(x, z) + d(z, y) for all x, y, z ∈ X.

Kantorovich lifting. A (discrete) probability distribution on X is a function μ: X → [0, 1] such that Σ_{x∈X} μ(x) = 1, and its support is supp(μ) = {x ∈ X | μ(x) > 0}. We denote by D(X) the set of probability distributions on X. A pseudometric d on X can be lifted to a pseudometric on probability distributions in D(X) by means of the Kantorovich lifting [Vil08]. The Kantorovich lifting of a pseudometric d on X to distributions μ, ν ∈ D(X) is defined by

K(d)(μ, ν) = sup { Σ_{x∈X} f(x)·(μ(x) − ν(x)) | f ∈ L_d },   (Kantorovich lifting)

where L_d denotes the set of non-expansive [0, 1]-valued functions on X, i.e., functions f: X → [0, 1] such that |f(x) − f(y)| ≤ d(x, y) for all x, y ∈ X.

The Kantorovich distance has the following well-known dual formulation:

K(d)(μ, ν) = min { Σ_{x,y∈X} d(x, y)·ω(x, y) | ω ∈ Ω(μ, ν) },

where Ω(μ, ν) denotes the set of measure-couplings for the pair (μ, ν), i.e., distributions ω ∈ D(X × X) such that Σ_{y∈X} ω(x, y) = μ(x) and Σ_{y∈X} ω(y, x) = ν(x) for all x ∈ X. It is a well-known fact that this dual characterisation can be equivalently stated by ranging ω over the set of vertices V(Ω(μ, ν)) of the polytope Ω(μ, ν). Thus, a minimum is always attained at a vertex. Furthermore, if the set X is finite, the set V(Ω(μ, ν)) is finite too.

Hausdorff lifting. A pseudometric d on X can be lifted to nonempty subsets of X by means of the Hausdorff lifting. The Hausdorff lifting of d on nonempty subsets A, B ⊆ X is defined by

H(d)(A, B) = max { sup_{a∈A} inf_{b∈B} d(a, b), sup_{b∈B} inf_{a∈A} d(a, b) }.   (Hausdorff lifting)
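For finite X, the dual (coupling) formulation of the Kantorovich lifting is a transportation problem and can be solved as a small linear program. The sketch below uses scipy; the 0/1 example metric is an illustrative assumption of ours (with it, K(d) coincides with the total variation distance).

```python
# The Kantorovich lifting computed via its dual (transportation) formulation:
# minimise Σ d(x,y)·ω(x,y) over all couplings ω with marginals μ and ν.
import numpy as np
from scipy.optimize import linprog

def kantorovich(d, mu, nu):
    """min over measure-couplings ω ∈ Ω(μ,ν) of Σ_{x,y} d(x,y)·ω(x,y)."""
    n = len(mu)
    cost = np.array([d[x][y] for x in range(n) for y in range(n)])
    A_eq, b_eq = [], []
    for x in range(n):                       # row marginals: Σ_y ω(x,y) = μ(x)
        row = np.zeros(n * n); row[x * n:(x + 1) * n] = 1
        A_eq.append(row); b_eq.append(mu[x])
    for y in range(n):                       # column marginals: Σ_x ω(x,y) = ν(y)
        col = np.zeros(n * n); col[y::n] = 1
        A_eq.append(col); b_eq.append(nu[y])
    res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.fun

# With the discrete (0/1) metric, K(d) is the total variation distance:
print(kantorovich([[0, 1], [1, 0]], [0.5, 0.5], [0.51, 0.49]))  # ≈ 0.01
```

The minimum is attained at a vertex of the coupling polytope, which is what the simplex-style LP solver returns.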
Following Mémoli [Mém11, Lemma 3.1], the Hausdorff lifting has a dual characterization in terms of set-couplings. Given A, B ⊆ X, a set-coupling for (A, B) is a relation R ⊆ X × X with left and right projections respectively equal to A and B, i.e., {a | ∃b ∈ X. a R b} = A and {b | ∃a ∈ X. a R b} = B. We write R(A, B) for the set of set-couplings for (A, B).

Theorem 2.1 ([Mém11]). H(d)(A, B) = inf { sup_{(a,b)∈R} d(a, b) | R ∈ R(A, B) }.

Clearly, for finite A, B, the inf and sup in Theorem 2.1 can be replaced by min and max, respectively.
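On small finite sets, Mémoli's duality can be checked by brute force: enumerate all relations with full projections and take the best worst-case distance. The pseudometric values below are an illustrative assumption.

```python
# Brute-force check of Theorem 2.1 on finite sets: the Hausdorff lifting equals
# the least worst-case distance over set-couplings. Example distances are ours.
from itertools import combinations

def hausdorff(d, A, B):
    """Primal definition: max of the two directed sup-inf distances."""
    return max(max(min(d[a][b] for b in B) for a in A),
               max(min(d[a][b] for a in A) for b in B))

def hausdorff_dual(d, A, B):
    """min over set-couplings R of (A, B) of max_{(a,b) in R} d(a,b)."""
    pairs = [(a, b) for a in A for b in B]
    best = float("inf")
    for k in range(1, len(pairs) + 1):
        for R in combinations(pairs, k):
            # R is a set-coupling iff its projections are exactly A and B.
            if {a for a, _ in R} == set(A) and {b for _, b in R} == set(B):
                best = min(best, max(d[a][b] for a, b in R))
    return best

d = {0: {0: 0, 1: 0.3, 2: 0.7},
     1: {0: 0.3, 1: 0, 2: 0.5},
     2: {0: 0.7, 1: 0.5, 2: 0}}
print(hausdorff(d, [0, 1], [1, 2]), hausdorff_dual(d, [0, 1], [1, 2]))  # 0.5 0.5
```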
3. Probabilistic Automata and Probabilistic Bisimilarity Distance
In this section we recall some definitions and results from the literature. In particular, we introduce the model of interest, probabilistic automata, its best-known behavioural equivalence, Segala and Lynch's probabilistic bisimilarity [SL94], and its quantitative generalization due to Deng et al. [DCPP06].

A probabilistic automaton is a model of computation that combines nondeterministic and probabilistic behaviours. Similarly to a standard nondeterministic automaton, states are labelled to express that certain properties of interest hold in that state. A probabilistic automaton in a current state s ∈ S can nondeterministically proceed to a next probabilistic state μ ∈ D(S), representing the probability distribution with which the automaton will move to the next state. This can be formalised as follows:

Definition 3.1. A probabilistic automaton (PA) is a tuple A = (S, L, →, ℓ) consisting of a nonempty finite set S of states, a finite set of labels L, a finite total transition relation → ⊆ S × D(S), and a labelling function ℓ: S → L.

For simplicity we assume the transition relation → to be total, that is, for all s ∈ S there exists a μ ∈ D(S) such that (s, μ) ∈ →. For the remainder of this paper we fix a probabilistic automaton A = (S, L, →, ℓ). We write s → μ to denote (s, μ) ∈ → and use δ(s) to denote the set {μ | s → μ} of successor distributions of s.

Next we recall the notion of probabilistic bisimilarity due to Segala and Lynch [SL94] for probabilistic automata. Their definition exploits the notion of lifting of a relation R ⊆ S × S on states to a relation R̃ ⊆ D(S) × D(S) on probability distributions on states, originally introduced by Jonsson and Larsen [JL91], and defined by μ R̃ ν if there exists a measure-coupling ω ∈ Ω(μ, ν) such that supp(ω) ⊆ R.

Definition 3.2.
A relation R ⊆ S × S is a probabilistic bisimulation if, whenever s R t,
• ℓ(s) = ℓ(t),
• if s → μ then there exists t → ν such that μ R̃ ν, and
• if t → ν then there exists s → μ such that μ R̃ ν.
Two states s, t ∈ S are probabilistic bisimilar, written s ∼ t, if they are related by some probabilistic bisimulation.

(Footnote: Mémoli uses the terminology "correspondence." To avoid confusion, we adopted the same terminology used in [PC19, Section 10.6].)

Intuitively, two states are probabilistic bisimilar if they have the same label and each transition of the one state to a distribution μ can be matched by a transition of the other state to a distribution ν assigning the same probability to states that behave the same, and vice versa. Probabilistic bisimilarity is an equivalence relation and the largest probabilistic bisimulation.

Deng et al. [DCPP06] proposed a family of 1-bounded pseudometrics d_λ, parametric in a discount factor λ ∈ (0, 1], called probabilistic bisimilarity pseudometrics. The pseudometrics d_λ are defined as the least fixed points of the functions ∆_λ: [0, 1]^{S×S} → [0, 1]^{S×S} given by

∆_λ(d)(s, t) = 1 if ℓ(s) ≠ ℓ(t), and ∆_λ(d)(s, t) = λ · H(K(d))(δ(s), δ(t)) otherwise.

The well-definedness of d_λ follows by the Knaster-Tarski fixed-point theorem, given the fact that ∆_λ is a monotone function on the complete partial order [0, 1]^{S×S}, ordered point-wise by d ⊑ d′ iff d(s, t) ≤ d′(s, t) for all s, t ∈ S.

The fact that probabilistic bisimilarity distances provide a quantitative generalization of bisimilarity is captured by the following theorem due to Deng et al. [DCPP06, Corollary 2.14].

Theorem 3.3.
For all λ ∈ (0, 1], d_λ(s, t) = 0 if and only if s ∼ t.

4. Probabilistic Bisimilarity Distance as a Simple Stochastic Game

A simple stochastic game (SSG) consists of a finite directed graph whose vertices are partitioned into sets of 0-sinks, 1-sinks, max vertices, min vertices, and random vertices. The game is played by two players, the max player and the min player, with a single token. At each step of the game, the token is moved from a vertex to one of its successors. At a min vertex the min player chooses the successor, at a max vertex the max player chooses the successor, and at a random vertex the successor is chosen randomly according to a prescribed probability distribution. The max player wins a play of the game if the token reaches a 1-sink, and the min player wins if the play reaches a 0-sink or continues forever without reaching a sink. Since the game is stochastic, the max player tries to maximize the probability of reaching a 1-sink, whereas the min player tries to minimize that probability.

Definition 4.1. A simple stochastic game is a tuple (V, E, P) consisting of
• a finite directed graph (V, E) such that
  – V is partitioned into the sets: V_0 of 0-sinks, V_1 of 1-sinks, V_max of max vertices, V_min of min vertices, and V_rnd of random vertices;
  – the vertices in V_0 and V_1 have outdegree zero and all other vertices have outdegree at least one, and
• a function P: V_rnd → D(V) such that, for all v ∈ V_rnd and w ∈ V, P(v)(w) > 0 iff (v, w) ∈ E.

The above definition is slightly more general than the one given by Condon in [Con92, Section 2]. Note that the outdegree of min, max and random vertices is at least one (instead of exactly two), and that there may be multiple 0-sinks and 1-sinks (rather than exactly one).
However, a simple stochastic game as defined above can be transformed in polynomial time into a simple stochastic game as defined in [Con92], as shown by Zwick and Paterson [ZP96].

A strategy, also known as a policy, for the min player is a function σ_min: V_min → V that assigns the target of an outgoing edge to each min vertex, that is, (v, σ_min(v)) ∈ E for all v ∈ V_min. Likewise, a strategy for the max player is a function σ_max: V_max → V that assigns the target of an outgoing edge to each max vertex. These strategies are known as pure stationary strategies. We can restrict ourselves to these strategies since, for both players, the optimal among all strategies are of this type (see, for example, [LL69]).

Such strategies determine a sub-game in which each max vertex and each min vertex has outdegree one (see [Con92, Section 2] for details). Such a game can naturally be viewed as a Markov chain. We write φ_{σ_min,σ_max}: V → [0, 1] for the function that gives the probability of a vertex in this Markov chain to reach a 1-sink.

The value function φ: V → [0, 1] of an SSG is defined as min_{σ_min} max_{σ_max} φ_{σ_min,σ_max}. It is folklore that the value function of a simple stochastic game can be characterised as the least fixed point of the following function (see, for example, [Jub05, Sections 2.2 and 2.3]).

Definition 4.2.
The function Φ: [0, 1]^V → [0, 1]^V is defined by

Φ(f)(v) = 0 if v ∈ V_0,
Φ(f)(v) = 1 if v ∈ V_1,
Φ(f)(v) = max_{(v,w)∈E} f(w) if v ∈ V_max,
Φ(f)(v) = min_{(v,w)∈E} f(w) if v ∈ V_min,
Φ(f)(v) = Σ_{(v,w)∈E} P(v)(w)·f(w) if v ∈ V_rnd.

The set [0, 1]^V can be ordered point-wise by f ⊑ g iff f(v) ≤ g(v) for all v ∈ V. This partial order is complete on [0, 1]^V, with meet and join respectively given by the point-wise infimum and supremum. Then, the existence of the least fixed point of Φ is ensured by the Knaster-Tarski fixed-point theorem and the following result.

Proposition 4.3.
The function Φ is monotone.

Proof. Let f, g ∈ [0, 1]^V with f ⊑ g, and let v ∈ V. It suffices to show that Φ(f)(v) ≤ Φ(g)(v). We distinguish the following cases.
• If v ∈ V_0 then Φ(f)(v) = 0 = Φ(g)(v).
• If v ∈ V_1 then Φ(f)(v) = 1 = Φ(g)(v).
• If v ∈ V_max then Φ(f)(v) = max_{(v,w)∈E} f(w) ≤ max_{(v,w)∈E} g(w) = Φ(g)(v).
• If v ∈ V_min then Φ(f)(v) = min_{(v,w)∈E} f(w) ≤ min_{(v,w)∈E} g(w) = Φ(g)(v).
• If v ∈ V_rnd then Φ(f)(v) = Σ_{(v,w)∈E} P(v)(w)·f(w) ≤ Σ_{(v,w)∈E} P(v)(w)·g(w) = Φ(g)(v).

The set [0, 1]^V can be turned into a Banach space by means of the supremum norm ‖f‖ = max_{v∈V} f(v). Recall that a function F: [0, 1]^V → [0, 1]^V is non-expansive if ‖F(f) − F(g)‖ ≤ ‖f − g‖ for all f, g ∈ [0, 1]^V.

Proposition 4.4.
The function Φ is non-expansive.

Proof. Let f, g ∈ [0, 1]^V and let v ∈ V. It suffices to show that |Φ(f)(v) − Φ(g)(v)| ≤ ‖f − g‖. We distinguish the following cases.
• If v ∈ V_0 then |Φ(f)(v) − Φ(g)(v)| = |0 − 0| = 0 ≤ ‖f − g‖.
• If v ∈ V_1 then |Φ(f)(v) − Φ(g)(v)| = |1 − 1| = 0 ≤ ‖f − g‖.
• Let v ∈ V_max. Without loss of generality, assume that max_{(v,w)∈E} f(w) ≥ max_{(v,w)∈E} g(w), and let x ∈ V realise the maximum of {f(w) | (v, w) ∈ E}. Then

|Φ(f)(v) − Φ(g)(v)| = max_{(v,w)∈E} f(w) − max_{(v,w)∈E} g(w) = f(x) − max_{(v,w)∈E} g(w) ≤ f(x) − g(x) ≤ ‖f − g‖.

• Let v ∈ V_min. Without loss of generality, assume that min_{(v,w)∈E} f(w) ≥ min_{(v,w)∈E} g(w), and let x ∈ V realise the minimum of {g(w) | (v, w) ∈ E}. Then

|Φ(f)(v) − Φ(g)(v)| = min_{(v,w)∈E} f(w) − min_{(v,w)∈E} g(w) = min_{(v,w)∈E} f(w) − g(x) ≤ f(x) − g(x) ≤ ‖f − g‖.

• If v ∈ V_rnd then

|Φ(f)(v) − Φ(g)(v)| = |Σ_{(v,w)∈E} P(v)(w)·f(w) − Σ_{(v,w)∈E} P(v)(w)·g(w)| = |Σ_{(v,w)∈E} P(v)(w)·(f(w) − g(w))| ≤ Σ_{(v,w)∈E} P(v)(w)·‖f − g‖ ≤ ‖f − g‖.

A Probabilistic Bisimilarity Game.
Fix a probabilistic automaton A and λ ∈ (0, 1]. We define a probabilistic bisimilarity game, where the min player tries to show that two states are probabilistic bisimilar, while the max player tries to prove the opposite.

In our probabilistic bisimilarity game, there is a vertex (s, t) for each pair of states s and t in A. If ℓ(s) ≠ ℓ(t) then the vertex (s, t) is a 1-sink. Otherwise, (s, t) is a min vertex. In this vertex, the min player selects a set R ∈ R(δ(s), δ(t)) of pairs of transitions. This set R captures potential matchings of transitions from state s and state t. Subsequently, the max player chooses a pair of transitions from the set R. Once the max player has chosen a pair (μ, ν) from the set R, corresponding to the transitions s → μ and t → ν, the min player can choose a measure-coupling ω ∈ Ω(μ, ν). To ensure that the game graph is finite, we restrict our attention to the vertices V(Ω(μ, ν)) of the polytope Ω(μ, ν). Such a measure-coupling ω captures a matching of the probability distributions μ and ν. Recall that a measure-coupling is a probability distribution on S × S. From a random vertex ω, the game proceeds to vertex (u, v) with probability λ·ω(u, v) and to the 0-sink vertex ⊥ with probability 1 − λ. Intuitively, the choices of R ∈ R(δ(s), δ(t)) and then (μ, ν) ∈ R, performed respectively by the min and the max player, correspond to the min and max of Theorem 2.1; analogously, the selection of ω ∈ V(Ω(μ, ν)) by the min player models the min in the definition of the Kantorovich lifting.

Formally, our probabilistic bisimilarity game for the automaton A is defined as follows.

Definition 4.5. Let λ ∈ (0, 1]. The probabilistic bisimilarity game (V, E, P) is defined by
• V_0 = {⊥},
• V_1 = {(s, t) ∈ S × S | ℓ(s) ≠ ℓ(t)},
• V_max = ⋃ {R(δ(s), δ(t)) | (s, t) ∈ V_min},
• V_min = {(s, t) ∈ S × S | ℓ(s) = ℓ(t)} ∪ ⋃ {R | R ∈ V_max},
• V_rnd = ⋃ {V(Ω(μ, ν)) | (μ, ν) ∈ V_min},
• E = {((s, t), R) | (s, t) ∈ V_min ∧ R ∈ R(δ(s), δ(t))}
      ∪ {(R, (μ, ν)) | R ∈ V_max ∧ (μ, ν) ∈ R}
      ∪ {((μ, ν), ω) | (μ, ν) ∈ V_min ∧ ω ∈ V(Ω(μ, ν))}
      ∪ {(ω, (u, v)) | ω ∈ V_rnd ∧ (u, v) ∈ supp(ω)}
      ∪ {(ω, ⊥) | ω ∈ V_rnd},
and, for all ω ∈ V_rnd and (s, t) ∈ supp(ω), P(ω)((s, t)) = λ·ω(s, t) and P(ω)(⊥) = 1 − λ.

By construction of the probabilistic bisimilarity game, there is a direct correspondence between the function Φ from Definition 4.2 associated to the probabilistic bisimilarity game and the function ∆_λ from Section 3 associated to the probabilistic automaton. From this correspondence it is straightforward that the respective least fixed points of Φ and ∆_λ agree, that is, the probabilistic bisimilarity distances of a probabilistic automaton are the values of the corresponding vertices of the probabilistic bisimilarity game.

Theorem 4.6.
For all λ ∈ (0, 1] and s, t ∈ S, d_λ(s, t) = φ(s, t).

Proof. The proof is similar to that of [vBW14, Theorem 14]. Let φ be the value function of the probabilistic bisimilarity game. Since Φ is monotone and non-expansive (Propositions 4.3 and 4.4), we conclude from [vB12, Corollary 1] that the closure ordinal of Φ is ω, that is, φ is the least upper bound of {Φ^n(0) | n ∈ ℕ}, where 0 denotes the function mapping every vertex to zero. Similarly, d_λ is the least upper bound of {∆_λ^n(0) | n ∈ ℕ}, where 0 denotes the function mapping every state pair to zero. Therefore, it suffices to show, by induction on n, that for all s, t ∈ S and n ∈ ℕ,

Φ^n(0)(s, t) = ∆_λ^n(0)(s, t).

Obviously, the above holds if n = 0. Let n > 0. We distinguish the following cases.
12 12
11 1 R = { ( t , u ) , ( u + v , u ) } , R (cid:48) = { ( u , u ) } . ( t, u ) R (cid:0) t , u (cid:1) (cid:0) u + v , u (cid:1) ( t,u ) 12 ( u,u ) + ( v,u ) ( u, u ) ( v, u ) R (cid:48) ( u , u ) ( u,u )
12 12
Figure 2: (Top left) A probabilistic automaton and (right) the associated simple stochastic game constructed as in Definition 4.5 for λ = 1 (only the portion reachable from (t, u) is shown), where δ_x denotes the Dirac distribution concentrated at x.

• If ℓ(s) ≠ ℓ(t) then the vertex (s, t) is a 1-sink and, hence, Φ^n(0)(s, t) = 1 = ∆_λ^n(0)(s, t).
• If ℓ(s) = ℓ(t) then

Φ^n(0)(s, t)
= min_{R∈R(δ(s),δ(t))} Φ^{n−1}(0)(R)
= min_{R∈R(δ(s),δ(t))} max_{(μ,ν)∈R} Φ^{n−1}(0)(μ, ν)
= min_{R∈R(δ(s),δ(t))} max_{(μ,ν)∈R} min_{ω∈V(Ω(μ,ν))} Φ^{n−1}(0)(ω)
= min_{R∈R(δ(s),δ(t))} max_{(μ,ν)∈R} min_{ω∈V(Ω(μ,ν))} ( Σ_{(u,v)∈supp(ω)} λ·ω(u, v)·Φ^{n−1}(0)(u, v) + (1 − λ)·Φ^{n−1}(0)(⊥) )
= min_{R∈R(δ(s),δ(t))} max_{(μ,ν)∈R} min_{ω∈V(Ω(μ,ν))} λ · Σ_{u,v∈S} ω(u, v)·Φ^{n−1}(0)(u, v)   (⊥ is a 0-sink)
= min_{R∈R(δ(s),δ(t))} max_{(μ,ν)∈R} min_{ω∈V(Ω(μ,ν))} λ · Σ_{u,v∈S} ω(u, v)·∆_λ^{n−1}(0)(u, v)   (by induction)
= λ · min_{R∈R(δ(s),δ(t))} max_{(μ,ν)∈R} K(∆_λ^{n−1}(0))(μ, ν)
= λ · H(K(∆_λ^{n−1}(0)))(δ(s), δ(t))   (Theorem 2.1)
= ∆_λ^n(0)(s, t).

Consider a state pair (s, t) with s ∼ t. By Theorem 3.3, d_λ(s, t) = 0. Hence, from Theorem 4.6 we can conclude that φ(s, t) = 0. Therefore, by pre-computing probabilistic bisimilarity, (s, t) can be represented as a 0-sink, rather than a min vertex. For example, in Figure 2 this amounts to turning (u, u) into a 0-sink and disconnecting it from its successors.

Games similar to the probabilistic bisimilarity game introduced above have been presented in [DLT08, vBW14, FKP17, KM18]. The game presented by van Breugel and Worrell
in [vBW14] is most closely related to our game. They also consider probabilistic automata and map a probabilistic automaton to a simple stochastic game. The only difference is that they use the original definition of the Hausdorff distance, whereas we use Mémoli's alternative characterization. The games described in [DLT08, FKP17, KM18] are not stochastic. Desharnais, Laviolette and Tracol [DLT08] define an ε-probabilistic bisimulation game for probabilistic automata, where ε > 0.
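The fixed-point characterisation of the value function (Definition 4.2) can be made concrete: Kleene iteration of Φ from the zero function converges to φ. The sketch below runs this iteration on a tiny hand-made game, which is our own toy illustration, not a game constructed from a probabilistic automaton via Definition 4.5.

```python
# Kleene iteration of the operator Φ from Definition 4.2, starting from the
# zero function; the tiny game below is an illustrative assumption.
V0, V1 = {"zero"}, {"one"}
Vmax, Vmin, Vrnd = {"a"}, {"b"}, {"r"}
E = {"a": ["b", "one"], "b": ["a", "r"], "r": ["zero", "one"]}
P = {"r": {"zero": 0.5, "one": 0.5}}

def phi(f):
    """One application of Φ: sinks are constant, max/min pick extremes, random averages."""
    g = {}
    for v in V0: g[v] = 0.0
    for v in V1: g[v] = 1.0
    for v in Vmax: g[v] = max(f[w] for w in E[v])
    for v in Vmin: g[v] = min(f[w] for w in E[v])
    for v in Vrnd: g[v] = sum(P[v][w] * f[w] for w in E[v])
    return g

f = {v: 0.0 for v in V0 | V1 | Vmax | Vmin | Vrnd}
for _ in range(100):   # iterate towards the least fixed point
    f = phi(f)
print(f["a"], f["b"])  # 1.0 0.5
```

Here the max vertex a can always force a move to the 1-sink (value 1), while the min vertex b prefers the random vertex r, which reaches a 1-sink with probability 1/2.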
5. A Coupling Characterisation of the Bisimilarity Distance
In this section we provide an alternative characterisation of the probabilistic bisimilarity distance d_λ based on the notion of coupling structure for a probabilistic automaton. This characterisation generalises the one by Chen et al. [CvBW12, Theorem 8] (see also [BBLM13, Theorem 8]) for the bisimilarity pseudometric of Desharnais et al. [DGJP04] for labelled Markov chains. Our construction exploits Mémoli's dual characterisation of the Hausdorff distance (Theorem 2.1).

Definition 5.1. A coupling structure for A is a tuple C = (f, ρ) consisting of
• a map f: D(S) × D(S) → D(S × S) such that f(μ, ν) ∈ Ω(μ, ν) for all μ, ν ∈ D(S), and
• a map ρ: S × S → 2^{D(S)×D(S)} such that ρ(s, t) ∈ R(δ(s), δ(t)) for all s, t ∈ S.

For convenience, the components f and ρ of a coupling structure will be called the measure-coupling map and the set-coupling map, respectively.

The definition of coupling structure is better understood in relation to the automaton it induces. The probabilistic automaton induced by C = (f, ρ), denoted A_C = (S × S, L × L, →_C, ℓ_C), has S × S as set of states, L × L as set of labels, transition relation →_C ⊆ (S × S) × D(S × S) defined as (s, t) →_C f(μ, ν) iff (μ, ν) ∈ ρ(s, t), and labelling function ℓ_C: S × S → L × L defined as ℓ_C(s, t) = (ℓ(s), ℓ(t)). Intuitively, A_C describes the concurrent execution of two copies of the probabilistic automaton A, synchronized by the coupling structure C.

Let λ ∈ (0, 1]. For a coupling structure C we define the function Γ_λ^C: [0, 1]^{S×S} → [0, 1]^{S×S} as

Γ_λ^C(d)(s, t) = 1 if ℓ(s) ≠ ℓ(t), and Γ_λ^C(d)(s, t) = λ · max { Σ_{u,v∈S} d(u, v)·ω(u, v) | (s, t) →_C ω } otherwise.

Lemma 5.2.
The function Γ^C_λ is well-defined and monotone.

Proof. The well-definedness of Γ^C_λ follows from the fact that λ ∈ (0, 1] and ∑_{u,v ∈ S} d(u, v) · ω(u, v) is a convex combination of the values (d(u, v))_{u,v ∈ S} ⊆ [0, 1].

As for monotonicity, let d, d′ ∈ [0, 1]^(S×S) with d ⊑ d′. Let s, t ∈ S; it suffices to show that Γ^C_λ(d)(s, t) ≤ Γ^C_λ(d′)(s, t). We distinguish the following cases:
• If ℓ(s) ≠ ℓ(t), then Γ^C_λ(d)(s, t) = 1 = Γ^C_λ(d′)(s, t).
• If ℓ(s) = ℓ(t), then we have
  Γ^C_λ(d)(s, t) = λ · max { ∑_{u,v ∈ S} d(u, v) · ω(u, v) | (s, t) →_C ω }
    = λ · ∑_{u,v ∈ S} d(u, v) · ω*(u, v)            (for some (s, t) →_C ω*)
    ≤ λ · ∑_{u,v ∈ S} d′(u, v) · ω*(u, v)           (d ⊑ d′ and ω*(u, v) ≥ 0 for all u, v ∈ S)
    ≤ λ · max { ∑_{u,v ∈ S} d′(u, v) · ω(u, v) | (s, t) →_C ω }   ((s, t) →_C ω*)
    = Γ^C_λ(d′)(s, t).

By the Knaster-Tarski fixed point theorem, Γ^C_λ has a least fixed point, denoted by γ^C_λ. As in [BBLM13], we call γ^C_λ the λ-discounted discrepancy w.r.t. C, or simply the λ-discrepancy.

Remark 5.3.
Note that the 1-discrepancy γ^C_1(s, t) is the maximal probability of reaching a pair of states (u, v) with ℓ(u) ≠ ℓ(v) in the probabilistic automaton A_C, starting from the state pair (s, t). It is well known that the maximal reachability probability can be computed in polynomial time as the optimal solution of a linear program (see [BK08, Theorem 10.100] or [Put94, Chapter 6]). The linear program can be trivially generalized to compute γ^C_λ for any λ ∈ (0, 1].
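As a complement to the linear-programming view: for λ < 1 the operator Γ^C_λ is a contraction, so the λ-discrepancy can also be approximated by simply iterating Γ^C_λ from the constant zero function. A minimal Python sketch (not the paper's implementation), under the assumption that the induced automaton A_C is given as a dictionary `steps` mapping each state pair to the list of its couplings ω (each a dictionary from successor pairs to probabilities), and `mismatch` collects the pairs with differing labels:

```python
def discrepancy(steps, mismatch, lam, tol=1e-9):
    """Approximate the λ-discrepancy γ^C_λ by iterating Γ^C_λ from 0.

    steps[(s, t)] : list of couplings ω, each a dict {(u, v): probability}
    mismatch      : set of pairs (s, t) with ℓ(s) ≠ ℓ(t) (value fixed to 1)
    """
    d = {p: (1.0 if p in mismatch else 0.0) for p in steps}
    while True:
        new = {}
        for p, omegas in steps.items():
            if p in mismatch:
                new[p] = 1.0
            else:
                # Γ^C_λ: discounted best expected value over available couplings
                new[p] = lam * max(
                    sum(prob * d[q] for q, prob in omega.items())
                    for omega in omegas)
        if max(abs(new[p] - d[p]) for p in steps) < tol:
            return new
        d = new

# toy induced automaton: pair ("s","t") loops on itself with probability 0.5
# and otherwise moves to the label-mismatching pair ("u","v")
steps = {("s", "t"): [{("s", "t"): 0.5, ("u", "v"): 0.5}],
         ("u", "v"): [{("u", "v"): 1.0}]}
gamma = discrepancy(steps, mismatch={("u", "v")}, lam=0.5)
```

On this toy instance the λ-discrepancy of ("s","t") solves x = λ(0.5·x + 0.5·1), i.e., x = 1/3 for λ = 0.5, and the iteration converges to it geometrically.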
Lemma 5.4. For all λ ∈ (0, 1] and coupling structures C of A, Δ_λ(γ^C_λ) ⊑ γ^C_λ.

Proof. Let C = (f, ρ). Let s, t ∈ S and R = ρ(s, t). We distinguish two cases.
• If ℓ(s) ≠ ℓ(t), then Δ_λ(γ^C_λ)(s, t) = 1 = γ^C_λ(s, t).
• If ℓ(s) = ℓ(t), then
  Δ_λ(γ^C_λ)(s, t) = λ · H(K(γ^C_λ))(δ(s), δ(t))   (def. Δ_λ)
    = λ · min { max_{(µ,ν) ∈ R′} K(γ^C_λ)(µ, ν) | R′ ∈ R(δ(s), δ(t)) }   (Theorem 2.1)
    ≤ λ · max_{(µ,ν) ∈ R} K(γ^C_λ)(µ, ν)   (R ∈ R(δ(s), δ(t)))
    = λ · max_{(µ,ν) ∈ R} min_{ω ∈ Ω(µ,ν)} ∑_{u,v ∈ S} γ^C_λ(u, v) · ω(u, v)   (def. K(γ^C_λ))
    ≤ λ · max_{(µ,ν) ∈ R} ∑_{u,v ∈ S} γ^C_λ(u, v) · f(µ, ν)(u, v)   (f(µ, ν) ∈ Ω(µ, ν))
    = λ · max { ∑_{u,v ∈ S} γ^C_λ(u, v) · ω(u, v) | (s, t) →_C ω }   (def. →_C)
    = Γ^C_λ(γ^C_λ)(s, t)   (def. Γ^C_λ)
    = γ^C_λ(s, t).   (γ^C_λ fixed point of Γ^C_λ)

By the generality of the chosen s and t, we conclude that Δ_λ(γ^C_λ) ⊑ γ^C_λ.

Corollary 5.5.
For all λ ∈ (0, 1] and coupling structures C for A, d_λ ⊑ γ^C_λ.

Proof. By the Knaster-Tarski fixed point theorem, d_λ is the least prefixed point of Δ_λ; therefore, by Lemma 5.4 we can conclude that d_λ ⊑ γ^C_λ.

The next lemma shows that the probabilistic bisimilarity distance can be characterised as the λ-discrepancy of a vertex coupling structure, that is, a coupling structure C = (f, ρ) such that f(µ, ν) ∈ V(Ω(µ, ν)) for all µ, ν ∈ D(S).

Lemma 5.6.
For all λ ∈ (0, 1], there exists a vertex coupling structure C for A such that d_λ = γ^C_λ.
Proof.
We construct a vertex coupling structure C = (f, ρ) as follows. We define f : D(S) × D(S) → D(S × S) by

  f(µ, ν) ∈ argmin_{ω ∈ V(Ω(µ,ν))} ∑_{u,v ∈ S} d_λ(u, v) · ω(u, v).

Hence,

  K(d_λ)(µ, ν) = ∑_{u,v ∈ S} d_λ(u, v) · f(µ, ν)(u, v).   (5.1)

We define ρ : S × S → 2^(D(S) × D(S)) by

  ρ(s, t) = { (µ, argmin_{ν ∈ δ(t)} K(d_λ)(µ, ν)) | µ ∈ δ(s) } ∪ { (argmin_{µ ∈ δ(s)} K(d_λ)(µ, ν), ν) | ν ∈ δ(t) }.

Hence, ρ(s, t) ∈ R(δ(s), δ(t)) and

  H(K(d_λ))(δ(s), δ(t)) = max { K(d_λ)(µ, ν) | (µ, ν) ∈ ρ(s, t) }.   (5.2)

Next, we show that Γ^C_λ(d_λ) ⊑ d_λ. Let s, t ∈ S. We distinguish two cases:
• If ℓ(s) ≠ ℓ(t), then d_λ(s, t) = Δ_λ(d_λ)(s, t) = 1 = Γ^C_λ(d_λ)(s, t).
• If ℓ(s) = ℓ(t), we have
  Γ^C_λ(d_λ)(s, t) = λ · max { ∑_{u,v ∈ S} d_λ(u, v) · ω(u, v) | (s, t) →_C ω }   (def. Γ^C_λ)
    = λ · max { ∑_{u,v ∈ S} d_λ(u, v) · f(µ, ν)(u, v) | (µ, ν) ∈ ρ(s, t) }   (def. →_C)
    = λ · max { K(d_λ)(µ, ν) | (µ, ν) ∈ ρ(s, t) }   (eq. (5.1))
    = λ · H(K(d_λ))(δ(s), δ(t))   (eq. (5.2))
    = d_λ(s, t).   (d_λ fixed point of Δ_λ)

Therefore Γ^C_λ(d_λ) = d_λ. Since γ^C_λ is the least fixed point of Γ^C_λ, by the Knaster-Tarski fixed point theorem γ^C_λ ⊑ d_λ. Moreover, by Corollary 5.5, d_λ ⊑ γ^C_λ. Thus d_λ = γ^C_λ.

Theorem 5.7.
Let λ ∈ (0, 1]. Then, the following hold:
(1) d_λ = ⊓ { γ^C_λ | C a coupling structure for A };
(2) s ∼ t iff γ^C_λ(s, t) = 0 for some vertex coupling structure C for A.

Proof. (1) follows by Corollary 5.5 and Lemma 5.6; (2) by Theorem 3.3 and Lemma 5.6.

Note that, together with Lemma 5.6, Theorem 5.7(1) states that d_λ is the minimal λ-discrepancy obtained by ranging over the subset of vertex coupling structures.

Remark 5.8 (On the relation with probabilistic bisimilarity games). The coupling structure characterization of the distance is strongly related to the simple stochastic game characterization presented in Section 4. Indeed, the notion of vertex coupling structure essentially captures the strategies for the min player in a probabilistic bisimilarity game, in the following sense: the measure-coupling map describes the strategy on the vertices of the form (µ, ν) ∈ R for some R ∈ V_max, while the set-coupling map describes the strategy on the min vertices (s, t) ∈ S × S. The discrepancy γ^C captures the value w.r.t. an optimal strategy for the max player when the min player has fixed their strategy a priori.

6. Computing the Bisimilarity Distance
We describe a procedure for computing the bisimilarity distances based on Condon's simple policy iteration algorithm [Con90]. Our procedure extends a similar one proposed in [TvB16, BBLM13] for computing the bisimilarity distances of Desharnais et al. [DGJP04] for labelled Markov chains. The extension takes into account the additional presence of nondeterminism in the choice of the transitions.

Condon's simple policy iteration algorithm computes the values of a simple stochastic game provided that the game is stopping, i.e., for each pair of strategies for the min and max players, the token reaches a 0-sink or 1-sink vertex with probability one.

As we have shown in Theorem 4.6, the probabilistic bisimilarity distances are the values of the corresponding vertices in the simple stochastic game given in Definition 4.5. Thus, if we prove that the game is stopping, we can apply Condon's simple policy iteration algorithm to compute the probabilistic bisimilarity distances.
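The stopping requirement can be checked numerically on small instances: in a stopping game, the Markov chain induced by any fixed strategy pair reaches a sink with probability one. A sketch (not the paper's implementation) that approximates, by iteration, the probability of eventually reaching a distinguished sink in a finite Markov chain given as a hypothetical dictionary of transition distributions:

```python
def reach_probability(chain, target, iters=2000):
    """Iteratively approximate, for every state of a finite Markov chain,
    the probability of eventually reaching the `target` state.

    chain[s] : dict mapping successor states to transition probabilities
    """
    p = {s: (1.0 if s == target else 0.0) for s in chain}
    for _ in range(iters):
        p = {s: 1.0 if s == target else
                sum(pr * p[t] for t, pr in chain[s].items())
             for s in chain}
    return p

# induced chain of a discounted game: every round the vertex moves to the
# 0-sink "bot" with probability 1 - lam and otherwise stays in play
lam = 0.8
chain = {"v": {"v": lam, "bot": 1 - lam}, "bot": {"bot": 1.0}}
probs = reach_probability(chain, "bot")
```

In the discounted case every vertex moves to the 0-sink ⊥ with probability 1 − λ at each round, so the reachability probability is one, matching Proposition 6.1 below.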
Proposition 6.1.
For λ ∈ (0, 1), the simple stochastic game in Definition 4.5 is stopping.

Proof. For each pair of strategies for the min and max players, each vertex in the induced Markov chain reaches the 0-sink vertex ⊥ with probability at least 1 − λ. Since λ < 1, from any state the probability of never reaching ⊥ is zero, i.e., the probability of eventually reaching the sink state ⊥ is one.

However, for λ = 1 the game in Definition 4.5 may not be stopping, as shown below.

Example 6.2.
Consider the probabilistic automaton in Figure 2 and its associated probabilistic bisimilarity game. By choosing a strategy σ_max for the max player such that σ_max(R) = (t, u), the vertex (t, u) has probability zero of reaching a sink. This can be seen in Figure 2, since there are no paths using the edge (R, (t, u)) that lead to a sink.

In [TvB16], for the case of labelled Markov chains, the simple stochastic game was proven to be stopping by imposing that bisimilar state pairs be 0-sinks. This method does not generalize to probabilistic automata: indeed, Example 6.2 provides a counterexample even when bisimilar state pairs are 0-sinks.

In the remainder of the section, we provide a general algorithm to compute the bisimilarity distance for every λ ∈ (0, 1].
Condon’s algorithm iteratively updates thestrategies of the min and max players in turn, on the basis of the current over-approximationof the value of the game. Next we show how Condon’s policy updates can be performeddirectly on coupling structures.For the update of the coupling structure, we use a measure-coupling map k ( d )( µ, ν ) ∈ V (Ω( µ, ν )) and a set-coupling map h ( d )( s, t ) ∈ R ( δ ( s ) , δ ( t )) such that k ( d )( µ, ν ) ∈ argmin (cid:110) (cid:80) u,v ∈ S ω ( u, v ) · d ( u, v ) | ω ∈ V (Ω( µ, ν )) (cid:111) , and (6.1) h ( d )( s, t ) ∈ argmin (cid:110) max ( µ,ν ) ∈ R K ( d )( µ, ν ) | R ∈ R ( δ ( s ) , δ ( t )) (cid:111) . (6.2)for d : S × S → [0 , µ, ν ∈ D ( S ), and s, t ∈ S . OMPUTING PROBABILISTIC BISIMILARITY DISTANCES FOR PROBABILISTIC AUTOMATA 15
The following lemma explains how the above ingredients can be used by the min player to improve its strategy.
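The set-coupling ingredient rests on Mémoli's dual characterisation: a set-coupling satisfying (6.2) can be assembled from pointwise minimizers, as made precise in (6.3) and Lemma 6.4 below. A small illustrative sketch, under the assumption that the values K(d)(µ, ν) have already been tabulated in a dictionary `K` indexed by pairs of distribution names:

```python
def set_coupling(K):
    """Build the set-coupling R of (6.3) from a table K[(mu, nu)] of
    Kantorovich values; by Lemma 6.4, the max of K over R equals the
    Hausdorff lifting H(K)(delta(s), delta(t))."""
    mus = sorted({m for m, _ in K})
    nus = sorted({n for _, n in K})
    phi = {m: min(nus, key=lambda n: K[(m, n)]) for m in mus}  # φ(µ)
    psi = {n: min(mus, key=lambda m: K[(m, n)]) for n in nus}  # ψ(ν)
    R = {(m, phi[m]) for m in mus} | {(psi[n], n) for n in nus}
    return R, max(K[p] for p in R)

def hausdorff(K):
    """Classical two-sided Hausdorff formula, for cross-checking Lemma 6.4."""
    mus = sorted({m for m, _ in K})
    nus = sorted({n for _, n in K})
    a = max(min(K[(m, n)] for n in nus) for m in mus)
    b = max(min(K[(m, n)] for m in mus) for n in nus)
    return max(a, b)

K = {("m1", "n1"): 0.2, ("m1", "n2"): 0.7,
     ("m2", "n1"): 0.9, ("m2", "n2"): 0.4}
R, val = set_coupling(K)
```

On this table the set-coupling is R = {(m1, n1), (m2, n2)}, and its maximal K-value 0.4 coincides with the Hausdorff value computed by the classical two-sided formula.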
Lemma 6.3.
Let C = (f, ρ). If there exist s, t ∈ S such that Δ_λ(γ^C_λ)(s, t) < γ^C_λ(s, t), then γ^D_λ ⊏ γ^C_λ for the coupling structure D = (k(γ^C_λ), ρ[(s, t)/R]), where R = h(γ^C_λ)(s, t).

Proof. Assume Δ_λ(γ^C_λ)(s, t) < γ^C_λ(s, t). Next we show Γ^D_λ(γ^C_λ) ⊏ γ^C_λ. In particular, we prove that Γ^D_λ(γ^C_λ)(s, t) < γ^C_λ(s, t) and, for all (u, v) ≠ (s, t), Γ^D_λ(γ^C_λ)(u, v) ≤ γ^C_λ(u, v).

By Δ_λ(γ^C_λ)(s, t) < γ^C_λ(s, t), we necessarily have ℓ(s) = ℓ(t). Thus
  Δ_λ(γ^C_λ)(s, t) = λ · H(K(γ^C_λ))(δ(s), δ(t))   (ℓ(s) = ℓ(t) and def. Δ_λ)
    = λ · min { max_{(µ,ν) ∈ R′} K(γ^C_λ)(µ, ν) | R′ ∈ R(δ(s), δ(t)) }   (Theorem 2.1)
    = λ · max_{(µ,ν) ∈ R} K(γ^C_λ)(µ, ν)   (R = h(γ^C_λ)(s, t) and (6.2))
    = λ · max_{(µ,ν) ∈ R} min_{ω ∈ Ω(µ,ν)} ∑_{u,v ∈ S} γ^C_λ(u, v) · ω(u, v)   (def. K(γ^C_λ))
    = λ · max_{(µ,ν) ∈ R} ∑_{u,v ∈ S} γ^C_λ(u, v) · k(γ^C_λ)(µ, ν)(u, v)   (by (6.1))
    = Γ^D_λ(γ^C_λ)(s, t).   (def. D and Γ^D_λ)

Therefore, Γ^D_λ(γ^C_λ)(s, t) = Δ_λ(γ^C_λ)(s, t) < γ^C_λ(s, t).

Let u, v ∈ S with (u, v) ≠ (s, t). We distinguish two cases.
• If ℓ(u) ≠ ℓ(v), then Γ^D_λ(γ^C_λ)(u, v) = 1 = Γ^C_λ(γ^C_λ)(u, v) = γ^C_λ(u, v).
• If ℓ(u) = ℓ(v), then
  Γ^D_λ(γ^C_λ)(u, v) = λ · max_{(µ,ν) ∈ ρ(u,v)} ∑_{x,y ∈ S} k(γ^C_λ)(µ, ν)(x, y) · γ^C_λ(x, y)   (def. Γ^D_λ and D)
    ≤ λ · max_{(µ,ν) ∈ ρ(u,v)} ∑_{x,y ∈ S} f(µ, ν)(x, y) · γ^C_λ(x, y)   ((6.1), f(µ, ν) ∈ Ω(µ, ν))
    = Γ^C_λ(γ^C_λ)(u, v) = γ^C_λ(u, v).   (def. Γ^C_λ and γ^C_λ)

Thus Γ^D_λ(γ^C_λ) ⊏ γ^C_λ.
By the Knaster-Tarski fixed point theorem, we conclude that γ^D_λ ⊏ γ^C_λ.

Lemma 6.3 suggests that C = (f, ρ) can be improved by replacing the measure-coupling map f with k(γ^C_λ) and updating the set-coupling map ρ at (s, t) with R = h(γ^C_λ)(s, t). Note that a measure-coupling k(d)(µ, ν) satisfying (6.1) can be computed by solving a linear program and ensuring that the optimal solution is a vertex of the polytope Ω(µ, ν) [Orl85, KS95]. A set-coupling h(d)(s, t) satisfying (6.2) is the following:

  R = { (µ, φ(µ)) | µ ∈ δ(s) } ∪ { (ψ(ν), ν) | ν ∈ δ(t) } ∈ R(δ(s), δ(t)),   (6.3)

where φ, ψ are such that φ(µ) ∈ argmin_{ν ∈ δ(t)} K(d)(µ, ν) and ψ(ν) ∈ argmin_{µ ∈ δ(s)} K(d)(µ, ν). The following lemma justifies our choice of h(d)(s, t).

Lemma 6.4.
Let R be as in (6.3). Then H(K(d))(δ(s), δ(t)) = max_{(µ,ν) ∈ R} K(d)(µ, ν).

Proof. By Theorem 2.1 and R ∈ R(δ(s), δ(t)), we have H(K(d))(δ(s), δ(t)) ≤ max_{(µ,ν) ∈ R} K(d)(µ, ν). Hence, it suffices to prove
(i) K(d)(µ, φ(µ)) ≤ H(K(d))(δ(s), δ(t)), for all µ ∈ δ(s), and
(ii) K(d)(ψ(ν), ν) ≤ H(K(d))(δ(s), δ(t)), for all ν ∈ δ(t).

Algorithm 1:
Simple policy iteration algorithm computing d_λ for λ ∈ (0, 1).
1   Initialise C = (f, ρ) as an arbitrary vertex coupling structure for A
2   while ∃(s, t). Δ_λ(γ^C_λ)(s, t) < γ^C_λ(s, t) do
3       R ← h(γ^C_λ)(s, t)
4       C ← (k(γ^C_λ), ρ[(s, t)/R])    /* update coupling structure */
5   end
6   return γ^C_λ    /* γ^C_λ = d_λ */

We prove (i). Let µ ∈ δ(s). Then
  H(K(d))(δ(s), δ(t)) ≥ max_{µ′ ∈ δ(s)} min_{ν ∈ δ(t)} K(d)(µ′, ν)   (def. H)
    ≥ min_{ν ∈ δ(t)} K(d)(µ, ν)   (µ ∈ δ(s))
    = K(d)(µ, φ(µ)).   (φ(µ) ∈ argmin_{ν ∈ δ(t)} K(d)(µ, ν))
The proof of (ii) follows similarly.

Remark 6.5.
The update procedure entailed by Lemma 6.3 can be performed in polynomial time in the size of the probabilistic automaton A. Indeed, k(d)(µ, ν) can be obtained by solving a transportation problem in polynomial time [Orl85, KS95]. As for h(d)(s, t), one can obtain φ(µ) (resp. ψ(ν)) by computing K(d)(µ, ν) in polynomial time and selecting the ν (resp. µ) ranging over δ(t) (resp. δ(s)) that achieves the minimum.

6.2. Discounted case.
The simple policy iteration algorithm for computing d_λ in the case λ < 1 starts from an arbitrary vertex coupling structure C_0 (line 1), constructed, e.g., by using the North-West corner method in polynomial time (see, e.g., [Str89, pg. 180]). Then it continues by iteratively generating a sequence C_0, C_1, ..., C_n of vertex coupling structures such that d_λ = γ^{C_n}_λ. At each iteration, the current coupling structure C_i is tested for optimality (line 2) by checking whether the corresponding λ-discrepancy γ^{C_i}_λ is a fixed point of Δ_λ. If there exists (s, t) ∈ S × S violating the equality γ^{C_i}_λ = Δ_λ(γ^{C_i}_λ), it constructs C_{i+1} by updating C_i at (s, t) as prescribed by Lemma 6.3 (line 4). This guarantees that γ^{C_i}_λ ⊐ γ^{C_{i+1}}_λ, i.e., a strict improvement of the λ-discrepancy towards the minimal one.

Termination follows from the fact that there are only finitely many vertex coupling structures for A. Furthermore, the correctness of the output of the algorithm is due to the fact that Δ_λ has a unique fixed point when 0 ≤ λ < 1.
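The North-West corner rule mentioned above admits a compact implementation; a sketch, assuming the two marginal distributions µ and ν are given as dictionaries in a fixed state order:

```python
def north_west_corner(mu, nu, eps=1e-12):
    """Compute a feasible point of the transportation polytope Omega(mu, nu)
    by the North-West corner rule: scan the (row, column) grid from the
    top-left, each time shipping as much mass as the remaining row supply
    and column demand allow."""
    rows, cols = list(mu.items()), list(nu.items())
    supply = [m for _, m in rows]
    demand = [m for _, m in cols]
    omega, i, j = {}, 0, 0
    while i < len(rows) and j < len(cols):
        m = min(supply[i], demand[j])
        if m > eps:
            omega[(rows[i][0], cols[j][0])] = m
        supply[i] -= m
        demand[j] -= m
        if supply[i] <= eps:   # row exhausted: move down
            i += 1
        else:                  # column exhausted: move right
            j += 1
    return omega

omega = north_west_corner({"u1": 0.5, "u2": 0.5}, {"v1": 0.3, "v2": 0.7})
```

Since the rule saturates one marginal constraint at each step, the result has at most |supp(µ)| + |supp(ν)| − 1 nonzero entries, i.e., it is a basic feasible solution of the transportation problem.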
Theorem 6.6. Let λ ∈ (0, 1). Algorithm 1 terminates and computes d_λ.

Proof. First we prove termination. Note that the set

  { γ^C_λ | C a vertex coupling structure for A }   (6.4)

is finite, because for all s, t ∈ S the set R(δ(s), δ(t)) is finite, and for all µ ∈ δ(s) and ν ∈ δ(t) the polytope Ω(µ, ν) has finitely many vertices, i.e., V(Ω(µ, ν)) is finite. Towards a contradiction, assume that Algorithm 1 does not terminate. Let C_0, C_1, C_2, ... be the infinite sequence of coupling structures generated during a non-terminating execution of Algorithm 1. Since the set in (6.4) is finite, there must be i < j such that γ^{C_i}_λ = γ^{C_j}_λ.
However, the updates of the coupling structures in Algorithm 1 ensure that γ^{C_n}_λ ⊐ γ^{C_{n+1}}_λ for all n ∈ ℕ: since we are considering a non-terminating execution, we have Δ_λ(γ^{C_n}_λ)(s, t) < γ^{C_n}_λ(s, t) for some s, t ∈ S, and C_{n+1} is obtained by the update performed in line 4, which is exactly the one prescribed by Lemma 6.3. This contradicts γ^{C_i}_λ = γ^{C_j}_λ. Hence, Algorithm 1 terminates.

When the execution of Algorithm 1 has reached the return statement, we have Δ_λ(γ^{C_n}_λ)(s, t) ≥ γ^{C_n}_λ(s, t) for all s, t ∈ S, i.e., γ^{C_n}_λ ⊑ Δ_λ(γ^{C_n}_λ). By Lemma 5.4, γ^{C_n}_λ ⊒ Δ_λ(γ^{C_n}_λ); therefore γ^{C_n}_λ = Δ_λ(γ^{C_n}_λ). The operator Δ_λ is λ-Lipschitz continuous [Tan18, Proposition 10.3.2(b)]; thus, by Banach's fixed point theorem, Δ_λ has a unique fixed point. Hence, γ^{C_n}_λ = d_λ.

6.3. Undiscounted case.
For λ = 1, the termination condition of the simple policy iteration algorithm of Section 6.1 is not sufficient to guarantee correctness, since Algorithm 1 may terminate prematurely by returning a fixed point of Δ_1 that is not the minimal one. Towards a stronger termination condition, we introduce the notion of self-closed relations w.r.t. a fixed point of Δ_1, originally due to [Fu12].

Definition 6.7.
A relation M ⊆ S × S is self-closed w.r.t. d = Δ_1(d) if, whenever s M t,
• ℓ(s) = ℓ(t) and d(s, t) > 0,
• if s → µ and d(s, t) = min_{ν′ ∈ δ(t)} K(d)(µ, ν′), then there exist t → ν and ω ∈ Ω(µ, ν) such that d(s, t) = ∑_{u,v ∈ S} d(u, v) · ω(u, v) and supp(ω) ⊆ M,
• if t → ν and d(s, t) = min_{µ′ ∈ δ(s)} K(d)(µ′, ν), then there exist s → µ and ω ∈ Ω(µ, ν) such that d(s, t) = ∑_{u,v ∈ S} d(u, v) · ω(u, v) and supp(ω) ⊆ M.
Two states are self-closed w.r.t. d, written s ≈_d t, if they are related by some self-closed relation w.r.t. d.

Let M be a self-closed relation w.r.t. d, where d = γ^C_1 for some coupling structure C. In the probabilistic automaton A_C, a state pair (s, t) ∈ M can reach some ω with supp(ω) ⊆ M. As we will see in the proof of Lemma 6.8, this allows us to reduce all values (d(s, t))_{(s,t) ∈ M} simultaneously by a small amount so that d still is a prefixed point of Δ_1.

It can be easily shown that ≈_d is the largest self-closed relation w.r.t. d. Note that the concept of self-closedness above is defined only for fixed points of Δ_1. As remarked in [Fu12], the largest self-closed relation ≈_d can be computed in polynomial time by using partition refinement techniques similar to those employed to compute the largest bisimilarity relation.

The next lemma states that if, for a fixed point d = Δ_1(d), the relation ≈_d is nonempty, then d is not the least fixed point of Δ_1.

Lemma 6.8.
Let d = Δ_1(d). If there exists a nonempty self-closed relation M w.r.t. d, then there exists d_M ⊏ d such that Δ_1(d_M) ⊑ d_M. Moreover, d_M can be computed in polynomial time in the size of the probabilistic automaton A.

Proof. Let M be a nonempty self-closed relation w.r.t. d. For arbitrary s, t ∈ S, µ ∈ δ(s), and ν ∈ δ(t), define

  θ_s(µ, t) = d(s, t) − min_{ν ∈ δ(t)} K(d)(µ, ν)   and   θ_t(s, ν) = d(s, t) − min_{µ ∈ δ(s)} K(d)(µ, ν).

Note that θ_s(µ, t) and θ_t(s, ν) are non-negative since d = Δ_1(d). Let θ = min{θ_1, θ_2, θ_3}, where
• θ_1 = min { θ_s(µ, t) | (s, t) ∈ M ∧ µ ∈ δ(s) ∧ θ_s(µ, t) > 0 };
• θ_2 = min { θ_t(s, ν) | (s, t) ∈ M ∧ ν ∈ δ(t) ∧ θ_t(s, ν) > 0 };
• θ_3 = min { d(s, t) | (s, t) ∈ M };
with the convention min ∅ = 1. Note that θ_3 > 0 because M is a nonempty self-closed relation w.r.t. d. Therefore θ > 0. We define the map d_M : S × S → [0, 1] as

  d_M(s, t) = d(s, t) − θ   if (s, t) ∈ M,
  d_M(s, t) = d(s, t)       if (s, t) ∉ M.

It is clear that d_M is well-defined. Moreover, d_M ⊏ d because M is nonempty and θ > 0. Next we prove that Δ_1(d_M) ⊑ d_M. Let s, t ∈ S. We consider two cases:
• Assume (s, t) ∉ M. Then
  Δ_1(d_M)(s, t) ≤ Δ_1(d)(s, t)   (by d_M ⊑ d and Δ_1 monotone)
    = d(s, t)   (d = Δ_1(d))
    = d_M(s, t).   ((s, t) ∉ M)
• Assume (s, t) ∈ M. Then ℓ(s) = ℓ(t). Let µ ∈ δ(s). We consider two subcases:
  (1) If θ_s(µ, t) > 0, then
    d_M(s, t) = d(s, t) − θ   (def. d_M)
      ≥ d(s, t) − θ_s(µ, t)   (0 < θ ≤ θ_s(µ, t))
      = min_{ν ∈ δ(t)} K(d)(µ, ν)   (def. θ_s(µ, t))
      ≥ min_{ν ∈ δ(t)} K(d_M)(µ, ν).   (d_M ⊑ d and K monotone)
  (2) If θ_s(µ, t) = 0, then d(s, t) = min_{ν ∈ δ(t)} K(d)(µ, ν). Since M is self-closed w.r.t. d, there exists ν′ ∈ δ(t) such that d(s, t) = ∑_{u,v ∈ S} d(u, v) · ω(u, v) for some ω ∈ Ω(µ, ν′) with supp(ω) ⊆ M. Thus
    min_{ν ∈ δ(t)} K(d_M)(µ, ν) ≤ K(d_M)(µ, ν′)   (ν′ ∈ δ(t))
      = min_{ω′ ∈ Ω(µ,ν′)} ∑_{u,v ∈ S} d_M(u, v) · ω′(u, v)   (def. K(d_M))
      ≤ ∑_{u,v ∈ S} d_M(u, v) · ω(u, v)   (ω ∈ Ω(µ, ν′))
      = ∑_{(u,v) ∈ M} d_M(u, v) · ω(u, v)   (supp(ω) ⊆ M)
      = ∑_{(u,v) ∈ M} (d(u, v) − θ) · ω(u, v)   (def. d_M)
      = ( ∑_{(u,v) ∈ M} d(u, v) · ω(u, v) ) − θ   (∑_{(u,v) ∈ M} ω(u, v) = 1)
      = d(s, t) − θ   (d(s, t) = ∑_{u,v ∈ S} d(u, v) · ω(u, v))
      = d_M(s, t).   (def. d_M)
  So, in both subcases (1) and (2), we have d_M(s, t) ≥ min_{ν ∈ δ(t)} K(d_M)(µ, ν). Since this inequality holds for all µ ∈ δ(s), we have d_M(s, t) ≥ max_{µ ∈ δ(s)} min_{ν ∈ δ(t)} K(d_M)(µ, ν).
Symmetrically, we can prove d_M(s, t) ≥ max_{ν ∈ δ(t)} min_{µ ∈ δ(s)} K(d_M)(µ, ν). Thus, by definition of the Hausdorff lifting, d_M(s, t) ≥ H(K(d_M))(δ(s), δ(t)). From this we conclude
  d_M(s, t) ≥ H(K(d_M))(δ(s), δ(t)) = Δ_1(d_M)(s, t).   (ℓ(s) = ℓ(t) and def. Δ_1)

Finally, we consider the complexity of computing d_M. For computing θ, we need to compute in turn θ_1, θ_2, and θ_3. Since M ⊆ S × S, computing θ_3 can be done in quadratic time in |S|. The computation of θ_1 requires at most |M| · ∑_{s ∈ S} |δ(s)| solutions of a transportation problem. This can be done in polynomial time in the size of A. Similarly for θ_2.

The proof of Lemma 6.8 is essentially that of [Fu12, Theorem 3]. Given a nonempty self-closed relation M w.r.t. d, the above result can be used to obtain a prefixed point of Δ_1, namely d_M, that improves d towards the least fixed point. The prefixed point d_M of Lemma 6.8 is obtained from d by subtracting a suitable value θ > 0 on M:

  d_M(s, t) = d(s, t) − θ   if (s, t) ∈ M,
  d_M(s, t) = d(s, t)       if (s, t) ∉ M.
The value of θ that gives us the smallest prefixed point of this form is the maximal value satisfying the following inequalities (where, as in the definitions of θ_1 and θ_2, the first two families range over the strictly positive right-hand sides):
  θ ≤ d(s, t) − min_{ν′ ∈ δ(t)} K(d)(µ, ν′)   for all (s, t) ∈ M and µ ∈ δ(s),
  θ ≤ d(s, t) − min_{µ′ ∈ δ(s)} K(d)(µ′, ν)   for all (s, t) ∈ M and ν ∈ δ(t),
  θ ≤ d(s, t)                                  for all (s, t) ∈ M.
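Given these quantities, computing θ and d_M is mechanical; a sketch, under the simplifying assumption that all slack values θ_s(µ, t) and θ_t(s, ν) for pairs in M (each in general requiring a transportation problem) have already been collected in a single list:

```python
def reduce_on_self_closed(d, M, slacks, eps=1e-12):
    """Compute theta of Lemma 6.8 and the reduced prefixed point d_M.

    d      : dict mapping state pairs to current distance values
    M      : set of state pairs forming a self-closed relation w.r.t. d
    slacks : all values theta_s(mu, t) and theta_t(s, nu) for pairs in M,
             assumed precomputed (theta_1 and theta_2 merged into one list)
    """
    positive = [x for x in slacks if x > eps]
    theta12 = min(positive) if positive else 1.0   # convention: min ∅ = 1
    theta3 = min(d[p] for p in M)                  # positive on a self-closed M
    theta = min(theta12, theta3)
    # subtract theta uniformly on M, leave everything else untouched
    return {p: (v - theta if p in M else v) for p, v in d.items()}

d = {("s", "t"): 0.5, ("u", "v"): 0.2, ("x", "y"): 1.0}
M = {("s", "t"), ("u", "v")}
dM = reduce_on_self_closed(d, M, slacks=[0.0, 0.3, 0.0])
```

Here the positive slacks give θ_1 = θ_2 = 0.3, while θ_3 = 0.2, so θ = 0.2 and both pairs in M are lowered by 0.2.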
The fact that d_M is a prefixed point follows from the fact that M is a self-closed relation.

The following lemma provides us with a termination condition for the simple policy iteration algorithm to compute d_1. Indeed, according to it, if d is a fixed point of Δ_1, we can assert that d equals the bisimilarity distance d_1 simply by checking that the maximal self-closed relation w.r.t. d is empty.

Lemma 6.9.
Let d = Δ_1(d). If ≈_d = ∅, then d = d_1.

Proof. We proceed by contraposition. Assume that d ≠ d_1. We define a nonempty self-closed relation M w.r.t. d as follows:

  m = max_{s,t ∈ S} (d(s, t) − d_1(s, t)),   M = { (s, t) ∈ S × S | d(s, t) − d_1(s, t) = m }.

Clearly, m > 0 and M ≠ ∅, because d_1 ⊑ d and d ≠ d_1.

Let (s, t) ∈ M. We prove that the three conditions of Definition 6.7 hold.
(1) d(s, t) > 0 since 0 < m = d(s, t) − d_1(s, t) ≤ d(s, t). Now we prove that ℓ(s) = ℓ(t). Towards a contradiction, assume ℓ(s) ≠ ℓ(t). Then
  0 < m = d(s, t) − d_1(s, t) = Δ_1(d)(s, t) − Δ_1(d_1)(s, t) = 1 − 1 = 0,
leading to the contradiction 0 < 0.
(2) Let µ ∈ δ(s) be such that d(s, t) = min_{ν ∈ δ(t)} K(d)(µ, ν). Then we have
  d_1(s, t) = Δ_1(d_1)(s, t)   (by def. d_1)
    = H(K(d_1))(δ(s), δ(t))   (by ℓ(s) = ℓ(t))
    ≥ min_{ν ∈ δ(t)} K(d_1)(µ, ν).   (by def. H)
Let ν* ∈ δ(t) and ω ∈ Ω(µ, ν*) be such that
  min_{ν ∈ δ(t)} K(d_1)(µ, ν) = K(d_1)(µ, ν*) = ∑_{u,v ∈ S} d_1(u, v) · ω(u, v).   (6.5)
Then the following inequalities hold:
  K(d_1)(µ, ν*) = ∑_{u,v ∈ S} d_1(u, v) · ω(u, v)   (by (6.5))
    = ∑_{u,v ∈ S} ( d(u, v) − (d(u, v) − d_1(u, v)) ) · ω(u, v)
    ≥ ∑_{u,v ∈ S} ( d(u, v) − m ) · ω(u, v)   (def. m)
    = ( ∑_{u,v ∈ S} d(u, v) · ω(u, v) ) − m   (ω ∈ D(S × S))
    ≥ K(d)(µ, ν*) − m.   (def. K and ω ∈ Ω(µ, ν*))
Thus, we have
  d(s, t) ≤ K(d)(µ, ν*)   (d(s, t) = min_{ν ∈ δ(t)} K(d)(µ, ν))
    ≤ K(d_1)(µ, ν*) + m   (K(d_1)(µ, ν*) ≥ K(d)(µ, ν*) − m)
    ≤ d_1(s, t) + m   (d_1(s, t) ≥ min_{ν ∈ δ(t)} K(d_1)(µ, ν) and def. ν*)
    = d_1(s, t) + (d(s, t) − d_1(s, t))   (def. m and (s, t) ∈ M)
    = d(s, t).
Therefore, all the above inequalities are, in fact, equalities. Hence,
  d(s, t) = K(d)(µ, ν*)   and   d_1(s, t) = K(d_1)(µ, ν*).   (6.6)
We conclude by proving that ω satisfies
  d(s, t) = ∑_{u,v ∈ S} d(u, v) · ω(u, v)   and   supp(ω) ⊆ M.
This can be observed as follows:
  ∑_{u,v ∈ S} d(u, v) · ω(u, v) ≥ d(s, t)   (ω ∈ Ω(µ, ν*) and (6.6))
    = d_1(s, t) + m   ((s, t) ∈ M and def. m)
    = ( ∑_{u,v ∈ S} d_1(u, v) · ω(u, v) ) + m   (by (6.6) and (6.5))
    = ∑_{u,v ∈ S} ( d_1(u, v) + m ) · ω(u, v)   (ω ∈ D(S × S))
    ≥ ∑_{u,v ∈ S} ( d_1(u, v) + (d(u, v) − d_1(u, v)) ) · ω(u, v)   (def. m)
    = ∑_{u,v ∈ S} d(u, v) · ω(u, v).
Hence, the above are in fact equalities and, in particular, d(s, t) = ∑_{u,v ∈ S} d(u, v) · ω(u, v). Consider now the following equalities:
  m = d(s, t) − d_1(s, t)   ((s, t) ∈ M)
    = d(s, t) − ∑_{u,v ∈ S} d_1(u, v) · ω(u, v)   (by (6.6) and (6.5))
    = ∑_{u,v ∈ S} ( d(u, v) − d_1(u, v) ) · ω(u, v).   (d(s, t) = ∑_{u,v ∈ S} d(u, v) · ω(u, v))
Since d(u, v) − d_1(u, v) ≤ m for all u, v ∈ S, the above equalities imply that whenever ω(u, v) > 0, d(u, v) − d_1(u, v) = m. Therefore supp(ω) ⊆ M.
(3) Can be argued symmetrically to the previous case.
Therefore, M is a nonempty self-closed relation with respect to d.

Algorithm 2: Simple policy iteration algorithm computing d_1.
1   Initialise C = (f, ρ) as an arbitrary vertex coupling structure for A
2   isMin ← false
3   while ¬isMin do
4       while ∃(s, t). Δ_1(γ^C_1)(s, t) < γ^C_1(s, t) do
5           R ← h(γ^C_1)(s, t)
6           C ← (k(γ^C_1), ρ[(s, t)/R])    /* update coupling structure */
7       end
8       M ← ≈_{γ^C_1}    /* note that γ^C_1 = Δ_1(γ^C_1) */
9       if M = ∅ then
10          isMin ← true    /* γ^C_1 = d_1 */
11      else
12          Compute d = (γ^C_1)_M as in Lemma 6.8
13          Re-initialise C as a vertex coupling structure s.t. Γ^C_1(d) = Δ_1(d)
14      end
15  end
16  return γ^C_1

Algorithm 2 extends the procedure described in Section 6.1 by encapsulating the policy iteration update (lines 4–7) into an outer loop (lines 3–15) that is responsible for checking whether the fixed point γ^{C_i}_1 returned is the minimal one. According to Lemma 5.4, Δ_1(γ^{C_i}_1) ⊑ γ^{C_i}_1. Hence, when we reach line 8, we have that Δ_1(γ^{C_i}_1) = γ^{C_i}_1. Therefore, by Lemmas 6.8 and 6.9, γ^{C_i}_1 = d_1 if and only if M = ≈_{γ^{C_i}_1} is empty. If M is empty, we set the variable isMin to true (line 10), causing the outer loop to terminate. Otherwise, we construct d = (γ^{C_i}_1)_M as in Lemma 6.8 (line 12) and re-start the inner loop from a vertex coupling structure C_{i+1} such that Γ^{C_{i+1}}_1(d) = Δ_1(d) (line 13) (e.g., by using C_{i+1} = (k(d), ρ) where ρ(s, t) = h(d)(s, t) for all s, t ∈ S). As proven in Theorem 6.10, γ^{C_i}_1 ⊐ γ^{C_{i+1}}_1. This guarantees a strict improvement of the discrepancy towards the minimal one. Termination of Algorithm 2 is justified by arguments similar to those for the discounted case.

Theorem 6.10.
Algorithm 2 terminates and computes d_1.

Proof. First we prove termination. Recall that { γ^C_1 | C a vertex coupling structure for A } is finite. Towards a contradiction, assume that Algorithm 2 does not terminate. Let C_0, C_1, C_2, ... be the infinite sequence of coupling structure updates generated during a non-terminating execution of Algorithm 2. Since the set above is finite, there must be i < j such that γ^{C_i}_1 = γ^{C_j}_1. However, the updates of the coupling structures in Algorithm 2 ensure that γ^{C_n}_1 ⊐ γ^{C_{n+1}}_1 for all n ∈ ℕ, which contradicts this. Indeed, let n ∈ ℕ; we consider two cases:
• Assume Δ_1(γ^{C_n}_1)(s, t) < γ^{C_n}_1(s, t) for some s, t ∈ S. In this case C_{n+1} is obtained by the update performed in line 6, which is exactly the one prescribed by Lemma 6.3. Therefore γ^{C_n}_1 ⊐ γ^{C_{n+1}}_1.
• Assume Δ_1(γ^{C_n}_1)(s, t) ≥ γ^{C_n}_1(s, t) for all s, t ∈ S, i.e., γ^{C_n}_1 ⊑ Δ_1(γ^{C_n}_1). By Lemma 5.4, γ^{C_n}_1 ⊒ Δ_1(γ^{C_n}_1); therefore γ^{C_n}_1 = Δ_1(γ^{C_n}_1). In this case C_{n+1} is constructed as a vertex coupling structure such that Γ^{C_{n+1}}_1(d) = Δ_1(d), where M = ≈_{γ^{C_n}_1} ≠ ∅ and d = (γ^{C_n}_1)_M (see line 13). Then the following hold:
  γ^{C_n}_1 ⊐ d ⊒ Δ_1(d)   (Lemma 6.8)
    = Γ^{C_{n+1}}_1(d)   (by construction)
    ⊒ γ^{C_{n+1}}_1.   (by Γ^{C_{n+1}}_1(d) ⊑ d and the Knaster-Tarski fixed point theorem)
This concludes the proof that γ^{C_n}_1 ⊐ γ^{C_{n+1}}_1 for all n ∈ ℕ.

When the execution of Algorithm 2 has reached line 10, we have γ^C_1 = Δ_1(γ^C_1) and, moreover, ≈_{γ^C_1} = ∅. Therefore, by Lemma 6.9, γ^C_1 = d_1. From here isMin is set to true, which prevents further executions of the body of the outer loop. Therefore Algorithm 2 reaches the return statement with γ^C_1 = d_1.

6.4. Experimental Results.
In this section, we evaluate the performance of the simple policy iteration algorithms on a collection of randomly generated probabilistic automata. All the algorithms have been implemented in Java, and the source code is publicly available.

The performance of Algorithm 1 has been compared with an implementation of the value iteration algorithm proposed by Fu [Fu12, Section 4]. This algorithm works as follows. Starting from the bottom element 0, it iteratively applies Δ_λ to the current distance function, generating the increasing chain 0 ⊑ Δ_λ(0) ⊑ Δ_λ²(0) ⊑ ··· ⊑ Δ_λ^{k−1}(0) ⊑ Δ_λ^k(0).

For each input instance, the comparison involves the following steps:
(1) We run Algorithm 1, storing the execution time, the number of solved transportation problems, and the number of coupling structures generated during the execution (i.e., the number of times a λ-discrepancy has been computed);
(2) Then, on the same instance, we execute the value iteration algorithm until its running time exceeds that of step 1. We report the execution time, the number of solved transportation problems, and the number of iterations;
(3) Finally, we report the error max_{s,t ∈ S} |d_λ(s, t) − d(s, t)| between the distance d_λ computed in step 1 and the approximate result d obtained in step 2.

This has been done for a collection of automata varying from 10 to 50 states. For each n = 10, ...,
50, we considered 100 randomly generated probabilistic automata, varying the probabilistic out-degree and the nondeterministic out-degree. Table 1 reports the average results of the comparison. Our algorithm is able to compute the exact solution before value iteration can under-approximate it with an error ranging from 0.004 to 0.06, which is a non-negligible error considering that we fixed λ = 0.8.

https://bitbucket.org/discoveri/probabilistic-bisimilarity-distances-probabilistic-automata

[Table 1: Average performance (execution time in seconds, number of solved transportation problems, and number of iterations) of Simple Policy Iteration versus Value Iteration, with the resulting error, on 100 randomly generated automata for each number of states n = 10..50, nondeterministic out-degree k = 1..3, and probabilistic out-degree p = 2..3. Discount factor λ = 0.8.]

[Figure 3: Average running time (seconds) of the Simple Policy Iteration Algorithm on 100 randomly generated automata, for each number of states n = 10..50, nondeterministic out-degree k = 1..3, and probabilistic out-degree p = 2..3 (curves p=2,k=1 through p=3,k=3). Discount factor λ = 0.8.]

Each iteration requires computing the λ-discrepancy (cf. Remark 5.3) as the solution of a linear program which has O(nk) variables and O(nk) constraints, where n and k are the number of states and the nondeterministic out-degree of the automaton, respectively.

Algorithm 2 extends the simple policy iteration algorithms proposed in [BBLM13, TvB16] for labelled Markov chains. As pointed out in [Tan18], implementations based on the decision procedure for the existential fragment of the first-order theory of the reals struggle to handle labelled Markov chains with fifty states. For probabilistic automata, the algorithms in [CdAMR08, CdAMR10] suffer from the same problem. The performance of Algorithm 2 is comparable to that of Algorithm 1 (cf. Table 2), despite the fact that the simple policy iteration algorithm alone is not guaranteed to be sound when the discount factor equals one.
50, nondeterministic out-degree k = 1 .. p = 2 ..
3. Discount λ = 1; accuracy 0 . Relation with Probabilistic Model Checking
In this section we show how the probabilistic bisimilarity distance of Deng et al. relates to the problem of model checking $\omega$-regular specifications against probabilistic automata, where the nondeterministic choices are resolved by randomized schedulers.

Probabilistic automata are used for the verification of concurrent probabilistic systems, where the choice of how to interleave the executions of the parallel components is modelled by means of nondeterminism in the choice of the next transition to be taken. Technically, an execution of a probabilistic automaton $\mathcal{A} = (S, L, \rightarrow, \ell)$ is an infinite sequence $s_0 s_1 \dots \in S^\omega$ of labelled states obtained by taking a succession of probabilistic transitions $s_i \rightarrow \mu_i$ such that $\mu_i(s_{i+1}) > 0$, for each $i \in \mathbb{N}$. The choice of the transition to be taken at each state is resolved by means of a scheduler. In this paper we consider randomized schedulers, i.e., functions $\pi \colon S^+ \to \mathcal{D}(S)$ mapping a nonempty finite sequence of states $s_0 \dots s_n \in S^+$ (the execution history) to a convex combination of distributions of the form $\sum_{s_n \rightarrow \mu} \alpha_\mu \cdot \mu \in \mathcal{D}(S)$, for some $\alpha_\mu \in [0,1]$ such that $\sum_{s_n \rightarrow \mu} \alpha_\mu = 1$. Roughly, a randomized scheduler decides the probability with which the next transition is chosen given the history of visited states.

The combination of a probabilistic automaton $\mathcal{A}$ with a scheduler $\pi$ induces a Markov chain on a random variable $X = (X_0, X_1, \dots) \in S^\omega$ on the measurable space of infinite sequences over $S$, with distribution
$$\Pr^{\pi}_s(X_0 = s_0, \dots, X_n = s_n) = \delta_s(s_0) \cdot \prod_{i=0}^{n-1} \pi(s_0 \dots s_i)(s_{i+1}),$$
where $\delta_x$ denotes the Dirac distribution concentrated at $x$. The above describes the probability of executing the sequence of steps $s_0 \dots s_n$ starting from the state $s$ under the randomized scheduler $\pi$.
The measurable sets of $S^\omega$ are the elements of the infinite product $\sigma$-algebra $(2^S)^\omega$, i.e., the smallest $\sigma$-algebra containing the subsets of the form $s_0 \dots s_n S^\omega = \{ s_0 \dots s_n w \mid w \in S^\omega \}$ (a.k.a. discrete cylinders), for arbitrary $n \in \mathbb{N}$ and $s_i \in S$ with $0 \le i \le n$. Measurable sets are the subsets of sequences on which the probability measure $\Pr^{\pi}_s$ is well-defined. For a measurable set $H \subseteq S^\omega$, we denote by $\Pr^{\pi}_s(X \in H)$ the probability that an execution starting from $s$ under the scheduler $\pi$ belongs to $H$.

Rather than measuring the probability of concrete executions over $S^\omega$, one is often more interested in the probability that certain execution traces satisfy abstract properties over the measurable space $L^\omega$ of infinite sequences of labels, representing the sequences of atomic properties satisfied by concrete executions of the automaton. Formally, for a measurable set $E \subseteq L^\omega$, we denote by $\Pr^{\pi}_s(\ell(X) \in E)$ the probability that an execution generates a sequence of labels in $E$, where $\ell(X) = (\ell(X_0), \ell(X_1), \dots) \in L^\omega$ is the random variable induced from $X$ by the labelling function $\ell$. The $\sigma$-algebra of $L^\omega$ contains all the $\omega$-regular languages over the alphabet $L$ [BK08, Chapter 10]. This means that the probability that the runs of $\mathcal{A}$ satisfy $\omega$-regular properties, possibly expressed in the form of LTL formulas, can be formally measured by $\Pr^{\pi}_s$, hence allowing the quantitative analysis of probabilistic automata.

The quantitative analysis of a probabilistic automaton $\mathcal{A}$ against $\omega$-regular specifications, more commonly known as probabilistic model checking, amounts to establishing the maximal and minimal probability of satisfying $\omega$-regular properties $E \subseteq L^\omega$ over infinite sequences of labels from a starting state $s$.
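For the simplest such properties, e.g., reaching a given label within $k$ steps, the extremal probabilities over all schedulers can be computed by dynamic programming, optimizing over the available transition distributions at each step (for step-bounded reachability the optimum is attained by a deterministic scheduler). The automaton below is a hypothetical illustration, not one of the paper's examples.

```python
# Maximal/minimal probability, over all schedulers, of reaching a state
# labelled 'bad' within k steps, by backward dynamic programming.

DELTA = {                                   # state -> available successor distributions
    's': [{'s': 1.0}, {'t': 0.5, 's': 0.5}],
    't': [{'t': 1.0}],
}
LABEL = {'s': 'ok', 't': 'bad'}

def opt_reach(state, k, pick):              # pick = max (best case) or min (worst case)
    v = {q: 1.0 if LABEL[q] == 'bad' else 0.0 for q in DELTA}
    for _ in range(k):
        v = {q: 1.0 if LABEL[q] == 'bad'
                else pick(sum(mu[r] * v[r] for r in mu) for mu in DELTA[q])
             for q in DELTA}
    return v[state]

print(opt_reach('s', 2, max))   # keep choosing the mixing transition: 0.75
print(opt_reach('s', 2, min))   # keep self-looping in 's': 0.0
```

The two calls bound, from above and below, what any randomized scheduler can achieve for this property.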
Formally, this corresponds to computing
$$\mathrm{Max}_s(E) = \sup_{\pi \in \Pi} \Pr^{\pi}_s(\ell(X) \in E) \qquad \text{and} \qquad \mathrm{Min}_s(E) = \inf_{\pi \in \Pi} \Pr^{\pi}_s(\ell(X) \in E),$$
where the infimum and supremum are taken over the set $\Pi$ of all randomized schedulers. Note that considering minimal or maximal probabilities corresponds to a worst-case or best-case analysis, respectively (see [BK08, Chapter 10] for more details).

The following is the main result of the section. It states that the probabilistic bisimilarity distance bounds the difference between the maximal (respectively, minimal) probabilities of satisfying any measurable linear-time property (e.g., $\omega$-regular specifications) from two given initial states.

Theorem 7.1.
For all measurable subsets $E \subseteq L^\omega$,
$$|\mathrm{Max}_s(E) - \mathrm{Max}_t(E)| \le d(s,t) \qquad \text{and} \qquad |\mathrm{Min}_s(E) - \mathrm{Min}_t(E)| \le d(s,t).$$

The above can be seen as a quantitative generalization of the folklore result that bisimilar states satisfy the same linear-time properties with the same probability.
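The proof of Theorem 7.1, given later in this section, rests on Mémoli's coupling characterization of the Hausdorff distance (Theorem 2.1): $\mathcal{H}(d)(K,H)$ is the least, over set-couplings $R \in \mathcal{R}(K,H)$, of the worst pairwise distance occurring in $R$. For small finite sets of reals this characterization can be checked by brute force; the sets below are arbitrary illustrations.

```python
from itertools import chain, combinations

K = [0.1, 0.5, 0.9]          # illustrative finite subsets of the real line
H = [0.2, 0.6]

def hausdorff(K, H):
    """Classical definition: max of the two directed Hausdorff distances."""
    directed = lambda A, B: max(min(abs(a - b) for b in B) for a in A)
    return max(directed(K, H), directed(H, K))

def coupling_hausdorff(K, H):
    """Dual form (cf. Theorem 2.1): inf over set-couplings R of sup |x - y|."""
    pairs = [(x, y) for x in K for y in H]
    best = float('inf')
    for R in chain.from_iterable(combinations(pairs, r)
                                 for r in range(1, len(pairs) + 1)):
        # R is a set-coupling when both of its projections are onto
        if {x for x, _ in R} == set(K) and {y for _, y in R} == set(H):
            best = min(best, max(abs(x - y) for x, y in R))
    return best

assert abs(hausdorff(K, H) - coupling_hausdorff(K, H)) < 1e-12
# [Mem11, Lemma 3.2], used in the proof: the Hausdorff distance dominates the
# difference of the suprema and the difference of the infima.
assert hausdorff(K, H) >= max(abs(max(K) - max(H)), abs(min(K) - min(H)))
```

The brute force enumerates all $2^{|K||H|}$ candidate relations, so it is only a sanity check for tiny sets.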
Remark 7.2.
The relevance of Theorem 7.1 is not just theoretical, but could possibly lead to significant practical applications. Imagine that the distance $d(s,t)$ between some given states $s$ and $t$ is small (and known); then computing $\mathrm{Max}_s(E)$ (resp. $\mathrm{Min}_s(E)$) at the state $s$ may be enough to obtain a good approximation of the actual value of $\mathrm{Max}_t(E)$ (resp. $\mathrm{Min}_t(E)$) without the need of computing it at the state $t$. This approach may lead to savings in the overall model-checking time for $t$, especially if the executions starting from $s$ have a significantly reduced degree of nondeterminism compared with those starting from $t$.

The proof of Theorem 7.1 is based on the coupling characterisation of the bisimilarity distance presented in Theorem 5.7 and the following technical lemma (Lemma 7.3), which establishes under which conditions the discrepancy associated with a coupling structure can be used to bound the variational distance between the probabilities induced by a probabilistic automaton $\mathcal{A}$ under two different schedulers. Specifically, we establish how, from a coupling structure $\mathcal{C}$, one can retrieve a set-coupling $R_\mathcal{C} \in \mathcal{R}(\Pi, \Pi)$ of schedulers for $\mathcal{A}$ such that, for each pair of schedulers $(\pi, \pi')$ for $\mathcal{A}$ in $R_\mathcal{C}$, the difference $|\Pr^{\pi}_s(\ell(X) \in E) - \Pr^{\pi'}_t(\ell(X) \in E)|$, for any measurable $E \subseteq L^\omega$, is bounded by the discrepancy $\gamma_\mathcal{C}(s,t)$.

The definition of $R_\mathcal{C}$ can be intuitively understood by recalling that $\gamma_\mathcal{C}(s,t)$ corresponds to the maximal probability of reaching a pair of states with different labels from the state pair $(s,t)$ in the automaton $\mathcal{A}_\mathcal{C}$ induced by $\mathcal{C}$. Roughly, $R_\mathcal{C}$ collects all the pairs of schedulers for $\mathcal{A}$ obtained as the left and right projections of a scheduler for $\mathcal{A}_\mathcal{C}$. How the projections are defined is technical, and the interested reader can find the formal definition in the proof.

Lemma 7.3.
For any coupling structure $\mathcal{C}$ for $\mathcal{A}$ and $s, t \in S$, there exists $R_\mathcal{C} \in \mathcal{R}(\Pi, \Pi)$ such that, for all measurable $E \subseteq L^\omega$ and $(\pi, \pi') \in R_\mathcal{C}$,
$$|\Pr^{\pi}_s(\ell(X) \in E) - \Pr^{\pi'}_t(\ell(X) \in E)| \le \gamma_\mathcal{C}(s,t).$$

Proof. Fix $s, t \in S$ and a coupling structure $\mathcal{C} = (f, \rho)$ for $\mathcal{A}$. Let $\mathcal{A}_\mathcal{C}$ be the automaton associated with the coupling structure $\mathcal{C}$. We split the proof in two parts: (Part 1) deals with the definition of the set-coupling $R_\mathcal{C} \in \mathcal{R}(\Pi, \Pi)$; (Part 2) with proving that $|\Pr^{\pi}_s(\ell(X) \in E) - \Pr^{\pi'}_t(\ell(X) \in E)| \le \gamma_\mathcal{C}(s,t)$ for all pairs of randomized schedulers $(\pi, \pi') \in R_\mathcal{C}$.

Hereafter, for a nonempty finite sequence $\sigma \in S^+$ and a random variable $X$ on $S^\omega$, we use $X \prec \sigma$ to denote $X \in \sigma S^\omega$.

Part 1:
Let $(X, Y) \in S^\omega \times S^\omega$ be the random variable describing the infinite sequence of state pairs along a run of $\mathcal{A}_\mathcal{C}$. Then, for any two nonempty finite sequences of the same length $\sigma = \sigma_0 \dots \sigma_n$ and $\tau = \tau_0 \dots \tau_n$ over $S$,
$$\Pr^{\pi}_{(s,t)}((X,Y) \prec \langle\sigma,\tau\rangle) = \Pr^{\pi}_{(s,t)}(X \prec \sigma,\, Y \prec \tau)$$
is the probability that, starting from $(s,t)$, a run of $\mathcal{A}_\mathcal{C}$ under the scheduler $\pi$ has prefix $\langle\sigma,\tau\rangle = (\sigma_0,\tau_0)\dots(\sigma_n,\tau_n)$. The above can be alternatively formulated in terms of conditional probabilities in the following two ways:
$$\Pr^{\pi}_{(s,t)}(X \prec \sigma,\, Y \prec \tau) = \Pr^{\pi}_{(s,t)}(X \prec \sigma) \cdot \Pr^{\pi}_{(s,t)}(Y \prec \tau \mid X \prec \sigma), \eqno(7.1)$$
$$\Pr^{\pi}_{(s,t)}(X \prec \sigma,\, Y \prec \tau) = \Pr^{\pi}_{(s,t)}(Y \prec \tau) \cdot \Pr^{\pi}_{(s,t)}(X \prec \sigma \mid Y \prec \tau). \eqno(7.2)$$

Given a scheduler $\pi$ for $\mathcal{A}_\mathcal{C}$, we define the maps $\pi_S, \pi_T \colon S^+ \to \mathcal{D}(S)$ as follows (adopting the convention that $\Pr^{\pi}_{(s,t)}(E \mid F) = 0$ whenever $\Pr^{\pi}_{(s,t)}(F) = 0$, for any two measurable events $E, F$), for arbitrary nonempty sequences $\sigma, \tau$ over $S$:
$$\pi_S(\sigma)(u) = \sum_{\tau \in S^{|\sigma|}} \Big( \Pr^{\pi}_{(s,t)}(Y \prec \tau \mid X \prec \sigma) \sum_{v \in S} \pi(\langle\sigma,\tau\rangle)(u,v) \Big), \eqno(7.3)$$
$$\pi_T(\tau)(u) = \sum_{\sigma \in S^{|\tau|}} \Big( \Pr^{\pi}_{(s,t)}(X \prec \sigma \mid Y \prec \tau) \sum_{v \in S} \pi(\langle\sigma,\tau\rangle)(v,u) \Big). \eqno(7.4)$$
We call $\pi_S$ and $\pi_T$ the left and right projections of $\pi$. Intuitively, $\pi_S(\sigma)(u)$ describes the probability that, under the scheduler $\pi$, a run of $\mathcal{A}_\mathcal{C}$ with initial state $(s,t)$ has a prefix of the form $\langle\sigma u, \tau'\rangle$, for some $\tau' \in S^{|\sigma|+1}$; symmetrically, $\pi_T(\tau)(u)$ is the probability that the prefix is of the form $\langle\sigma', \tau u\rangle$, for some $\sigma' \in S^{|\tau|+1}$.

Next, we prove that $\pi_S$ and $\pi_T$ are well-defined schedulers for $\mathcal{A}$. We provide the proof only for $\pi_S$, as the one for $\pi_T$ is similar. We need to show that $\pi_S(\sigma)$ is a convex combination of the form $\sum_{\mu \in \delta(\sigma_n)} \alpha_\mu \cdot \mu$, for some $\alpha_\mu \in [0,1]$ such that $\sum_{\mu \in \delta(\sigma_n)} \alpha_\mu = 1$. By the hypothesis that $\pi$ is a scheduler for $\mathcal{A}_\mathcal{C}$, we have that $\pi(\langle\sigma,\tau\rangle) = \sum_{(\mu,\nu) \in \rho(\sigma_n,\tau_n)} \xi_{\mu,\nu} \cdot f(\mu,\nu)$, for some $\xi_{\mu,\nu} \in [0,1]$ such that $\sum_{(\mu,\nu) \in \rho(\sigma_n,\tau_n)} \xi_{\mu,\nu} = 1$. Without loss of generality, we can assume that $\xi_{\mu,\nu} = 0$ whenever $(\mu,\nu) \notin \rho(\sigma_n,\tau_n)$. Let $\kappa_{\sigma,\tau} = \Pr^{\pi}_{(s,t)}(Y \prec \tau \mid X \prec \sigma)$; then
$$\begin{aligned}
\pi_S(\sigma)(u) &= \sum_{\tau \in S^{|\sigma|}} \Big( \kappa_{\sigma,\tau} \sum_{v \in S} \pi(\langle\sigma,\tau\rangle)(u,v) \Big) && \text{(eq. (7.3))}\\
&= \sum_{\tau \in S^{|\sigma|}} \Big( \kappa_{\sigma,\tau} \sum_{v \in S} \sum_{(\mu,\nu) \in \rho(\sigma_n,\tau_n)} \xi_{\mu,\nu} \cdot f(\mu,\nu)(u,v) \Big) && \text{($\pi$ scheduler)}\\
&= \sum_{\tau \in S^{|\sigma|}} \Big( \kappa_{\sigma,\tau} \sum_{(\mu,\nu) \in \rho(\sigma_n,\tau_n)} \xi_{\mu,\nu} \sum_{v \in S} f(\mu,\nu)(u,v) \Big)\\
&= \sum_{\tau \in S^{|\sigma|}} \Big( \kappa_{\sigma,\tau} \sum_{(\mu,\nu) \in \rho(\sigma_n,\tau_n)} \xi_{\mu,\nu} \cdot \mu(u) \Big) && \text{($f$ measure-coupling map)}\\
&= \sum_{\tau \in S^{|\sigma|}} \Big( \kappa_{\sigma,\tau} \sum_{\mu \in \delta(\sigma_n)} \sum_{\nu \in \delta(\tau_n)} \xi_{\mu,\nu} \cdot \mu(u) \Big) && \text{($\rho$ set-coupling map)}\\
&= \sum_{\mu \in \delta(\sigma_n)} \Big( \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} \sum_{\nu \in \delta(\tau_n)} \xi_{\mu,\nu} \Big) \cdot \mu(u).
\end{aligned}$$
By letting $\alpha_\mu = \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} \sum_{\nu \in \delta(\tau_n)} \xi_{\mu,\nu}$, we get $\pi_S(\sigma) = \sum_{\mu \in \delta(\sigma_n)} \alpha_\mu \cdot \mu$ in the desired form. Next we show that this is a convex combination, i.e., $\sum_{\mu \in \delta(\sigma_n)} \alpha_\mu = 1$:
$$\begin{aligned}
\sum_{\mu \in \delta(\sigma_n)} \alpha_\mu &= \sum_{\mu \in \delta(\sigma_n)} \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} \sum_{\nu \in \delta(\tau_n)} \xi_{\mu,\nu} && \text{(def. $\alpha_\mu$)}\\
&= \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} \sum_{\mu \in \delta(\sigma_n)} \sum_{\nu \in \delta(\tau_n)} \xi_{\mu,\nu}\\
&= \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} \sum_{(\mu,\nu) \in \rho(\sigma_n,\tau_n)} \xi_{\mu,\nu} && \text{($\rho$ set-coupling map)}\\
&= \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} && \text{($\textstyle\sum \xi_{\mu,\nu} = 1$)}\\
&= \sum_{\tau \in S^{|\sigma|}} \Pr^{\pi}_{(s,t)}(Y \prec \tau \mid X \prec \sigma) = 1. && \text{(def. $\kappa_{\sigma,\tau}$; probability)}
\end{aligned}$$
So $\pi_S$ and $\pi_T$ are well-defined schedulers for $\mathcal{A}$. Given the above, we define the relation $R_\mathcal{C} \subseteq \Pi \times \Pi$ on schedulers for $\mathcal{A}$ by
$$R_\mathcal{C} = \{ (\pi_S, \pi_T) \mid \pi \text{ scheduler on } \mathcal{A}_\mathcal{C} \}.$$
To better understand the definition of $R_\mathcal{C}$, recall that $\mathcal{A}_\mathcal{C}$ can be interpreted as the automaton describing the concurrent execution of two copies of $\mathcal{A}$ synchronised according to $\mathcal{C}$. Then, $\pi_S$ and $\pi_T$ can be interpreted as the schedulers obtained from $\pi$ by respectively taking the left and right projections of the executions of $\mathcal{A}_\mathcal{C}$ as computations on $\mathcal{A}$. The relation $R_\mathcal{C}$ is the collection of these pairs of projections.

Next we prove that $R_\mathcal{C}$ is a set-coupling for $(\Pi, \Pi)$, that is,
$$\{ \pi_1 \mid \exists \pi_2 \in \Pi .\ (\pi_1, \pi_2) \in R_\mathcal{C} \} = \Pi \qquad \text{and} \qquad \{ \pi_2 \mid \exists \pi_1 \in \Pi .\ (\pi_1, \pi_2) \in R_\mathcal{C} \} = \Pi.$$
By definition of $R_\mathcal{C}$, this is equivalent to proving that for an arbitrary pair of schedulers $\pi_S, \pi_T$ for $\mathcal{A}$ we can find a scheduler $\pi$ for $\mathcal{A}_\mathcal{C}$ such that (7.3) and (7.4) hold (hence $(\pi_S, \pi_T) \in R_\mathcal{C}$).

Let $\sigma = \sigma_0 \dots \sigma_n$ and $\tau = \tau_0 \dots \tau_n$ be a pair of nonempty finite sequences of the same length over $S$, and assume $\pi_S(\sigma) = \sum_{\mu \in \delta(\sigma_n)} \alpha^\sigma_\mu \cdot \mu$ and $\pi_T(\tau) = \sum_{\nu \in \delta(\tau_n)} \beta^\tau_\nu \cdot \nu$, for some $\alpha^\sigma_\mu, \beta^\tau_\nu \in [0,1]$ such that $\sum_{\mu \in \delta(\sigma_n)} \alpha^\sigma_\mu = 1$ and $\sum_{\nu \in \delta(\tau_n)} \beta^\tau_\nu = 1$. We define
$$\pi(\langle\sigma,\tau\rangle) = \sum_{(\mu,\nu) \in \rho(\sigma_n,\tau_n)} \xi^{\sigma,\tau}_{\mu,\nu} \cdot f(\mu,\nu), \qquad \text{where } \xi^{\sigma,\tau}_{\mu,\nu} = \begin{cases} \alpha^\sigma_\mu \cdot \beta^\tau_\nu & \text{if } (\mu,\nu) \in \rho(\sigma_n,\tau_n),\\ 0 & \text{otherwise.} \end{cases}$$
By the fact that $\rho(\sigma_n,\tau_n)$ is a set-coupling in $\mathcal{R}(\delta(\sigma_n), \delta(\tau_n))$ and the definition of $\xi^{\sigma,\tau}_{\mu,\nu}$, it is easy to see that, for all $\mu \in \delta(\sigma_n)$ and $\nu \in \delta(\tau_n)$,
$$\sum_{\nu \in \delta(\tau_n)} \xi^{\sigma,\tau}_{\mu,\nu} = \alpha^\sigma_\mu \qquad \text{and} \qquad \sum_{\mu \in \delta(\sigma_n)} \xi^{\sigma,\tau}_{\mu,\nu} = \beta^\tau_\nu. \eqno(7.5)$$
Next we show that (7.3) holds. Let $\kappa_{\sigma,\tau} = \Pr^{\pi}_{(s,t)}(Y \prec \tau \mid X \prec \sigma)$; then
$$\begin{aligned}
\pi_S(\sigma)(u) &= \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau}\, \pi_S(\sigma)(u) && \text{(convex combination)}\\
&= \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} \Big( \sum_{\mu \in \delta(\sigma_n)} \alpha^\sigma_\mu \cdot \mu(u) \Big) && \text{(def. $\pi_S(\sigma)$)}\\
&= \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} \Big( \sum_{\mu \in \delta(\sigma_n)} \sum_{\nu \in \delta(\tau_n)} \xi^{\sigma,\tau}_{\mu,\nu} \cdot \mu(u) \Big) && \text{(by (7.5))}\\
&= \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} \Big( \sum_{\mu \in \delta(\sigma_n)} \sum_{\nu \in \delta(\tau_n)} \xi^{\sigma,\tau}_{\mu,\nu} \sum_{v \in S} f(\mu,\nu)(u,v) \Big) && \text{($f(\mu,\nu) \in \Omega(\mu,\nu)$)}\\
&= \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} \Big( \sum_{(\mu,\nu) \in \rho(\sigma_n,\tau_n)} \xi^{\sigma,\tau}_{\mu,\nu} \sum_{v \in S} f(\mu,\nu)(u,v) \Big) && \text{($\rho$ set-coupling map)}\\
&= \sum_{\tau \in S^{|\sigma|}} \Big( \kappa_{\sigma,\tau} \sum_{v \in S} \sum_{(\mu,\nu) \in \rho(\sigma_n,\tau_n)} \xi^{\sigma,\tau}_{\mu,\nu} \cdot f(\mu,\nu)(u,v) \Big)\\
&= \sum_{\tau \in S^{|\sigma|}} \Big( \Pr^{\pi}_{(s,t)}(Y \prec \tau \mid X \prec \sigma) \sum_{v \in S} \pi(\langle\sigma,\tau\rangle)(u,v) \Big). && \text{(def. $\kappa_{\sigma,\tau}$ and $\pi$)}
\end{aligned}$$
Equation (7.4) is proven symmetrically.

Part 2:
We first prove that $\Pr^{\pi}_{(s,t)} \in \Omega(\Pr^{\pi_S}_s, \Pr^{\pi_T}_t)$. Showing the marginal conditions corresponds to proving that, for all nonempty sequences $\sigma = \sigma_0 \dots \sigma_n$ and $\tau = \tau_0 \dots \tau_n$ over $S$,
$$\Pr^{\pi_S}_s(X \prec \sigma) = \Pr^{\pi}_{(s,t)}(X \prec \sigma) \qquad \text{and} \qquad \Pr^{\pi_T}_t(Y \prec \tau) = \Pr^{\pi}_{(s,t)}(Y \prec \tau).$$
We prove only the equality on the left, as the other is similar. We proceed by induction on $n \ge 0$.

• Base case ($n = 0$). Then $\sigma = \sigma_0 \in S$ and
$$\begin{aligned}
\Pr^{\pi_S}_s(X \prec \sigma_0) &= \delta_s(\sigma_0) && \text{(def. $\Pr^{\pi_S}_s$)}\\
&= \sum_{\tau_0 \in S} \delta_s(\sigma_0) \cdot \delta_t(\tau_0) && \text{(convex combination)}\\
&= \sum_{\tau_0 \in S} \delta_{(s,t)}(\sigma_0, \tau_0) && \text{(def. $\delta_{(s,t)}$)}\\
&= \sum_{\tau_0 \in S} \Pr^{\pi}_{(s,t)}(X \prec \sigma_0,\, Y \prec \tau_0) && \text{(def. $\Pr^{\pi}_{(s,t)}$)}\\
&= \Pr^{\pi}_{(s,t)}(X \prec \sigma_0). && \text{(additivity)}
\end{aligned}$$

• Inductive step ($n \ge 0$). Let $\sigma = \sigma_0 \dots \sigma_n$ and $s' \in S$; then
$$\begin{aligned}
\Pr^{\pi_S}_s(X \prec \sigma s') &= \Pr^{\pi_S}_s(X \prec \sigma) \cdot \pi_S(\sigma)(s') && \text{(def. $\Pr^{\pi_S}_s$)}\\
&= \Pr^{\pi}_{(s,t)}(X \prec \sigma) \cdot \pi_S(\sigma)(s') && \text{(inductive hypothesis)}\\
&= \Pr^{\pi}_{(s,t)}(X \prec \sigma) \cdot \sum_{\tau \in S^{|\sigma|}} \Big( \Pr^{\pi}_{(s,t)}(Y \prec \tau \mid X \prec \sigma) \sum_{t' \in S} \pi(\langle\sigma,\tau\rangle)(s',t') \Big) && \text{(eq. (7.3))}\\
&= \sum_{\tau \in S^{|\sigma|}} \sum_{t' \in S} \Pr^{\pi}_{(s,t)}(X \prec \sigma) \cdot \Pr^{\pi}_{(s,t)}(Y \prec \tau \mid X \prec \sigma) \cdot \pi(\langle\sigma,\tau\rangle)(s',t')\\
&= \sum_{\tau \in S^{|\sigma|}} \sum_{t' \in S} \Pr^{\pi}_{(s,t)}(X \prec \sigma,\, Y \prec \tau) \cdot \pi(\langle\sigma,\tau\rangle)(s',t') && \text{(by (7.1))}\\
&= \sum_{\tau \in S^{|\sigma|}} \sum_{t' \in S} \Pr^{\pi}_{(s,t)}(X \prec \sigma s',\, Y \prec \tau t') && \text{(def. $\Pr^{\pi}_{(s,t)}$)}\\
&= \Pr^{\pi}_{(s,t)}(X \prec \sigma s'). && \text{(additivity)}
\end{aligned}$$
The right-marginal condition is proven symmetrically.

Note that the discrepancy $\gamma_\mathcal{C}(s,t)$ is the maximal probability of reaching a state pair $(u,v)$ such that $\ell(u) \ne \ell(v)$ by starting from the state pair $(s,t)$ in $\mathcal{A}_\mathcal{C}$. That is,
$$\gamma_\mathcal{C}(s,t) = \sup_\pi \Pr^{\pi}_{(s,t)}(\ell(X) \ne \ell(Y)), \eqno(7.6)$$
where $\pi$ ranges over all schedulers for $\mathcal{A}_\mathcal{C}$. Thus, from $\Pr^{\pi}_{(s,t)} \in \Omega(\Pr^{\pi_S}_s, \Pr^{\pi_T}_t)$ we have
$$\begin{aligned}
\Pr^{\pi_S}_s(\ell(X) \in E) &= \Pr^{\pi}_{(s,t)}(\ell(X) \in E)\\
&\ge \Pr^{\pi}_{(s,t)}(\ell(X) = \ell(Y),\, \ell(Y) \in E)\\
&= 1 - \Pr^{\pi}_{(s,t)}(\{\ell(X) \ne \ell(Y)\} \cup \{\ell(Y) \notin E\})\\
&\ge 1 - \Pr^{\pi}_{(s,t)}(\ell(X) \ne \ell(Y)) - \Pr^{\pi}_{(s,t)}(\ell(Y) \notin E)\\
&= \Pr^{\pi}_{(s,t)}(\ell(Y) \in E) - \Pr^{\pi}_{(s,t)}(\ell(X) \ne \ell(Y))\\
&\ge \Pr^{\pi}_{(s,t)}(\ell(Y) \in E) - \gamma_\mathcal{C}(s,t)\\
&= \Pr^{\pi_T}_t(\ell(Y) \in E) - \gamma_\mathcal{C}(s,t).
\end{aligned}$$
From the above, and the symmetric argument, we conclude that $|\Pr^{\pi_S}_s(\ell(X) \in E) - \Pr^{\pi_T}_t(\ell(Y) \in E)| \le \gamma_\mathcal{C}(s,t)$.
The proof follows immediately by combining Parts 1 and 2.

Given Lemma 7.3, it is easy to establish the result stated in Theorem 7.1.
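Before turning to the proof of Theorem 7.1, note that the inequality at the heart of Part 2 admits a simple finite illustration: whenever a joint distribution couples two trace distributions, their difference on any event is bounded by the probability that the coupled traces disagree. The joint distribution and trace names below are hypothetical.

```python
from itertools import combinations

# A hypothetical coupling: joint probability of (left trace, right trace).
TRACES = ['ab', 'ba', 'bb']
JOINT = {('ab', 'ab'): 0.4, ('ba', 'ba'): 0.3, ('ab', 'bb'): 0.2, ('bb', 'ba'): 0.1}

def p_left(E):                   # left marginal probability of the event E
    return sum(p for (x, _), p in JOINT.items() if x in E)

def p_right(E):                  # right marginal probability of the event E
    return sum(p for (_, y), p in JOINT.items() if y in E)

# probability that the coupled traces disagree (cf. the discrepancy in (7.6))
p_disagree = sum(p for (x, y), p in JOINT.items() if x != y)

# |Pr(X in E) - Pr(Y in E)| <= Pr(X != Y) for *every* event E: on {X = Y}
# the two traces are either both in E or both outside it.
for r in range(len(TRACES) + 1):
    for E in combinations(TRACES, r):
        assert abs(p_left(set(E)) - p_right(set(E))) <= p_disagree + 1e-12

print('disagreement probability:', p_disagree)
```

Lemma 7.3 applies exactly this observation, with the coupling supplied by a scheduler of $\mathcal{A}_\mathcal{C}$ and the disagreement probability bounded by $\gamma_\mathcal{C}(s,t)$.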
Proof of Theorem 7.1.
Let $d_{\mathbb{R}}(x,y) = |x - y|$ denote the Euclidean distance on the real line, and define
$$K = \{ \Pr^{\pi}_s(\ell(X) \in E) \mid \pi \in \Pi \} \qquad \text{and} \qquad H = \{ \Pr^{\pi}_t(\ell(X) \in E) \mid \pi \in \Pi \}.$$
Then, by [Mém11, Lemma 3.2],
$$\mathcal{H}(d_{\mathbb{R}})(K, H) \ge \max\{ |\sup K - \sup H|,\, |\inf K - \inf H| \} = \max\{ |\mathrm{Max}_s(E) - \mathrm{Max}_t(E)|,\, |\mathrm{Min}_s(E) - \mathrm{Min}_t(E)| \}.$$
Next we show that $d(s,t) \ge \mathcal{H}(d_{\mathbb{R}})(K, H)$:
$$\begin{aligned}
\mathcal{H}(d_{\mathbb{R}})(K, H) &= \inf \Big\{ \sup_{(\pi,\pi') \in R} |\Pr^{\pi}_s(\ell(X) \in E) - \Pr^{\pi'}_t(\ell(X) \in E)| \;\Big|\; R \in \mathcal{R}(\Pi, \Pi) \Big\} && \text{(Theorem 2.1)}\\
&\le \inf \{ \gamma_\mathcal{C}(s,t) \mid \mathcal{C} \text{ coupling structure for } \mathcal{A} \} && \text{(Lemma 7.3)}\\
&= d(s,t). && \text{(Theorem 5.7)}
\end{aligned}$$
Therefore, $|\mathrm{Max}_s(E) - \mathrm{Max}_t(E)| \le d(s,t)$ and $|\mathrm{Min}_s(E) - \mathrm{Min}_t(E)| \le d(s,t)$.

Another consequence of Lemma 7.3 is that the bisimilarity distance provides an upper bound on the Hausdorff lifting of the variational distance between the sets of distributions induced by the Markov chains obtained by ranging over all possible schedulers. In the theorem below we use $TV$ to denote the total variation distance between probability measures, defined as
$TV(\mu, \nu) = \sup_E |\mu(E) - \nu(E)|$, where $E$ ranges over all measurable subsets.

Theorem 7.4. $\mathcal{H}(TV)(\{ \Pr^{\pi}_s(\ell(X) \in \cdot) \mid \pi \in \Pi \},\, \{ \Pr^{\pi}_t(\ell(X) \in \cdot) \mid \pi \in \Pi \}) \le d(s,t)$.

Proof.
$$\begin{aligned}
&\mathcal{H}(TV)(\{ \Pr^{\pi}_s(\ell(X) \in \cdot) \mid \pi \in \Pi \},\, \{ \Pr^{\pi}_t(\ell(X) \in \cdot) \mid \pi \in \Pi \})\\
&\quad= \inf \Big\{ \sup_{(\pi,\pi') \in R} TV(\Pr^{\pi}_s(\ell(X) \in \cdot),\, \Pr^{\pi'}_t(\ell(X) \in \cdot)) \;\Big|\; R \in \mathcal{R}(\Pi, \Pi) \Big\} && \text{(Theorem 2.1)}\\
&\quad= \inf \Big\{ \sup_{(\pi,\pi') \in R} \sup_E |\Pr^{\pi}_s(\ell(X) \in E) - \Pr^{\pi'}_t(\ell(X) \in E)| \;\Big|\; R \in \mathcal{R}(\Pi, \Pi) \Big\} && \text{(def. $TV$)}\\
&\quad\le \inf \{ \gamma_\mathcal{C}(s,t) \mid \mathcal{C} \text{ coupling structure for } \mathcal{A} \} && \text{(Lemma 7.3)}\\
&\quad= d(s,t). && \text{(Theorem 5.7)}
\end{aligned}$$

Theorem 7.4 can alternatively be stated as follows: for any scheduler $\pi$ there exists a scheduler $\pi'$ such that $|\Pr^{\pi}_s(\ell(X) \in E) - \Pr^{\pi'}_t(\ell(X) \in E)| \le d(s,t)$ for all measurable subsets $E \subseteq L^\omega$.

Conclusion and Future Work
We presented a novel characterization of the probabilistic bisimilarity distance of Deng et al. [DCPP06] as the solution of a simple stochastic game. Starting from it, we designed algorithms for computing the distances based on Condon's simple policy iteration algorithm. The correctness of Condon's approach relies on the assumption that the input game is stopping. This may not be the case for our probabilistic bisimilarity games when the discount factor is one. We overcame this problem by means of an improved termination condition based on the notion of self-closed relation due to Fu [Fu12].

As in [TvB16], our simple policy iteration algorithm has exponential worst-case time complexity. Nevertheless, experiments show that our method can compete in practice with the value iteration algorithm by Fu [Fu12], which has theoretical polynomial-time complexity for $\lambda < 1$. To the best of our knowledge, our algorithm is the first practical solution for computing the bisimilarity distance when $\lambda = 1$, performing orders of magnitude faster than the existing solutions based on the existential fragment of the first-order theory of the reals [CdAMR08, CdAMR10, CHL07].

As future work, we plan to improve upon the current implementation in the line of [TvB18a], by exploiting the fact that bisimilar states and states at probabilistic distance one [TvB18b] can be efficiently pre-computed before starting the policy iteration. We believe that this would yield a significant cut-down in the time required to compute the discrepancy at each iteration, which turned out to be the bottleneck of our algorithms.

More efficient algorithms might speed up verification tools for concurrent probabilistic systems, as behavioural distances relate to the satisfiability of logical properties. For the case of labelled Markov chains, in [CvBW12, BBLM15] the variational difference between two states with respect to their probability of satisfying linear-time properties (e.g., LTL formulas) is shown to be bounded by the (undiscounted) probabilistic bisimilarity distance. In Section 7 we showed that a similar result holds for probabilistic automata, with additional subtleties that arise from the need to handle nondeterminism. In light of this relation, it would be interesting to develop approximation techniques to cut down the overall model-checking time of probabilistic automata, as briefly discussed in Remark 7.2. We also plan to extend the work on approximate minimization [BBLM17, BBLM18] to probabilistic automata, and to explore the possible relation between the probabilistic bisimilarity distance and more expressive logics for concurrent probabilistic systems [CdAMR08, CdAMR10, Mio12].

References

[BBL+
19] Giorgio Bacci, Giovanni Bacci, Kim G. Larsen, Radu Mardare, Qiyi Tang, and Franck van Breugel. Computing Probabilistic Bisimilarity Distances for Probabilistic Automata. In CONCUR 2019, volume 140 of LIPIcs, pages 9:1–9:17. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019.
[BBLM13] Giorgio Bacci, Giovanni Bacci, Kim G. Larsen, and Radu Mardare. On-the-Fly Exact Computation of Bisimilarity Distances. In TACAS 2013, volume 7795 of Lecture Notes in Computer Science, pages 1–15. Springer, 2013.
[BBLM15] Giorgio Bacci, Giovanni Bacci, Kim G. Larsen, and Radu Mardare. Converging from Branching to Linear Metrics on Markov Chains. In Proceedings, volume 9399 of Lecture Notes in Computer Science, pages 349–367. Springer, 2015.
[BBLM17] Giovanni Bacci, Giorgio Bacci, Kim G. Larsen, and Radu Mardare. On the Metric-Based Approximate Minimization of Markov Chains. In ICALP 2017, volume 80 of LIPIcs, pages 104:1–104:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017.
[BBLM18] Giovanni Bacci, Giorgio Bacci, Kim G. Larsen, and Radu Mardare. On the metric-based approximate minimization of Markov Chains. J. Log. Algebr. Meth. Program., 100:36–56, 2018.
[BCD+11] Clark Barrett, Christopher L. Conway, Morgan Deters, Liana Hadarean, Dejan Jovanović, Tim King, Andrew Reynolds, and Cesare Tinelli. CVC4. In Ganesh Gopalakrishnan and Shaz Qadeer, editors, CAV 2011, volume 6806 of Lecture Notes in Computer Science, pages 171–177. Springer, 2011.
[BK08] Christel Baier and Joost-Pieter Katoen. Principles of Model Checking. MIT Press, 2008.
[CdAMR08] Krishnendu Chatterjee, Luca de Alfaro, Rupak Majumdar, and Vishwanath Raman. Algorithms for Game Metrics. In IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2008, volume 2 of LIPIcs, pages 107–118. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2008.
[CdAMR10] Krishnendu Chatterjee, Luca de Alfaro, Rupak Majumdar, and Vishwanath Raman. Algorithms for Game Metrics (Full Version). Logical Methods in Computer Science, 6(3), 2010.
[CHL07] Taolue Chen, Tingting Han, and Jian Lu. On Behavioral Metric for Probabilistic Systems: Definition and Approximation Algorithm. Pages 21–25. IEEE Computer Society, 2007.
[Con90] Anne Condon. On Algorithms for Simple Stochastic Games. In Advances in Computational Complexity Theory, volume 13 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pages 51–72. DIMACS/AMS, 1990.
[Con92] Anne Condon. The Complexity of Stochastic Games. Inf. Comput., 96(2):203–224, 1992.
[CvBW12] Di Chen, Franck van Breugel, and James Worrell. On the Complexity of Computing Probabilistic Bisimilarity. In FoSSaCS 2012, volume 7213 of Lecture Notes in Computer Science, pages 437–451. Springer, 2012.
[DCPP06] Yuxin Deng, Tom Chothia, Catuscia Palamidessi, and Jun Pang. Metrics for Action-labelled Quantitative Transition Systems. Electr. Notes Theor. Comput. Sci., 153(2):79–96, 2006.
[DGJP04] Josée Desharnais, Vineet Gupta, Radha Jagadeesan, and Prakash Panangaden. Metrics for Labelled Markov Processes. Theor. Comput. Sci., 318(3):323–354, 2004.
[DLT08] Josée Desharnais, François Laviolette, and Mathieu Tracol. Approximate Analysis of Probabilistic Processes: Logic, Simulation and Games. Pages 264–273. IEEE Computer Society, 2008.
[FKP17] Nathanaël Fijalkow, Bartek Klin, and Prakash Panangaden. Expressiveness of Probabilistic Modal Logics, Revisited. In ICALP 2017, volume 80 of LIPIcs, pages 105:1–105:12. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017.
[Fu12] Hongfei Fu. Computing Game Metrics on Markov Decision Processes. In ICALP 2012, volume 7392 of Lecture Notes in Computer Science, pages 227–238. Springer, 2012.
[Fu14] Hongfei Fu. Verifying Probabilistic Systems: New Algorithms and Complexity Results. PhD thesis, RWTH Aachen, Aachen, Germany, November 2014.
[GJS90] Alessandro Giacalone, Chi-Chang Jou, and Scott A. Smolka. Algebraic Reasoning for Probabilistic Concurrent Systems. In Proceedings of the IFIP WG 2.2/2.3 Working Conference on Programming Concepts and Methods, pages 443–458. North-Holland, 1990.
[Hau14] Felix Hausdorff. Grundzüge der Mengenlehre. Verlag Von Veit & Comp, Leipzig, 1914.
[JL91] Bengt Jonsson and Kim G. Larsen. Specification and Refinement of Probabilistic Processes. In LICS 1991, pages 266–277. IEEE Computer Society, 1991.
[Jub05] Brendan Juba. On the Hardness of Simple Stochastic Games. Master's thesis, Carnegie Mellon University, Pittsburgh, PA, USA, May 2005.
[Kan42] Leonid Vitalevich Kantorovich. On the transfer of masses (in Russian). Doklady Akademii Nauk, 5(5-6):1–4, 1942. Translated in Management Science, 1958.
[KM18] Barbara König and Christina Mika-Michalski. (Metric) Bisimulation Games and Real-Valued Modal Logics for Coalgebras. In CONCUR 2018, volume 118 of LIPIcs, pages 37:1–37:17. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018.
[KS95] Peter Kleinschmidt and Heinz Schannath. A Strongly Polynomial Algorithm for the Transportation Problem. Math. Program., 68:1–13, 1995.
[LL69] Thomas Liggett and Steven A. Lippman. Stochastic Games with Perfect Information and Time Average Payoff. SIAM Review, 11(4):604–607, 1969.
[Mém11] Facundo Mémoli. Gromov-Wasserstein Distances and the Metric Approach to Object Matching. Foundations of Computational Mathematics, 11(4):417–487, 2011.
[Mio12] Matteo Mio. On the Equivalence of Game and Denotational Semantics for the Probabilistic µ-calculus. Logical Methods in Computer Science, 8(2), 2012.
[Orl85] James B. Orlin. On the Simplex Algorithm for Networks and Generalized Networks. In Mathematical Programming Essays in Honor of George B. Dantzig Part I, pages 166–178. Springer Berlin Heidelberg, 1985.
[PC19] Gabriel Peyré and Marco Cuturi. Computational Optimal Transport. Foundations and Trends in Machine Learning, 11(5-6):355–607, 2019.
[Put94] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, Inc., New York, NY, USA, 1st edition, 1994.
[Sch99] Alexander Schrijver. Theory of Linear and Integer Programming. Wiley-Interscience Series in Discrete Mathematics and Optimization. Wiley, 1999.
[Seg95] Roberto Segala. Modeling and Verification of Randomized Distributed Real-time Systems. PhD thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, 1995.
[Sha53] Lloyd S. Shapley. Stochastic Games. Proceedings of the National Academy of Sciences, 39(10):1095–1100, 1953.
[SL94] Roberto Segala and Nancy A. Lynch. Probabilistic Simulations for Probabilistic Processes. In CONCUR 1994, volume 836 of Lecture Notes in Computer Science, pages 481–496. Springer, 1994.
[Str89] James K. Strayer. Linear Programming and its Applications. Undergraduate Texts in Mathematics. Springer-Verlag, New York, NY, USA, 1989.
[Tan18] Qiyi Tang. Computing Probabilistic Bisimilarity Distances. PhD thesis, York University, Toronto, Canada, August 2018.
[TvB16] Qiyi Tang and Franck van Breugel. Computing Probabilistic Bisimilarity Distances via Policy Iteration. In CONCUR 2016, volume 59 of LIPIcs, pages 22:1–22:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2016.
[TvB18a] Qiyi Tang and Franck van Breugel. Deciding Probabilistic Bisimilarity Distance One for Labelled Markov Chains. In CAV 2018, volume 10981 of Lecture Notes in Computer Science, pages 681–699. Springer, 2018.
[TvB18b] Qiyi Tang and Franck van Breugel. Deciding Probabilistic Bisimilarity Distance One for Probabilistic Automata. In Sven Schewe and Lijun Zhang, editors, CONCUR 2018, volume 118 of Leibniz International Proceedings in Informatics (LIPIcs), pages 9:1–9:17. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018.
[vB12] Franck van Breugel. On Behavioural Pseudometrics and Closure Ordinals. Inf. Process. Lett., 112(19):715–718, 2012.
[vBSW08] Franck van Breugel, Babita Sharma, and James Worrell. Approximating a Behavioural Pseudometric without Discount for Probabilistic Systems. Logical Methods in Computer Science, 4(2), 2008.
[vBW14] Franck van Breugel and James Worrell. The Complexity of Computing a Bisimilarity Pseudometric on Probabilistic Automata. In Horizons of the Mind. A Tribute to Prakash Panangaden - Essays Dedicated to Prakash Panangaden on the Occasion of His 60th Birthday, volume 8464 of Lecture Notes in Computer Science, pages 191–213. Springer, 2014.
[Vil08] Cédric Villani. Optimal Transport: Old and New. Grundlehren der mathematischen Wissenschaften. Springer, 2008.
[ZP96] Uri Zwick and Mike Paterson. The complexity of mean payoff games on graphs. Theoretical Computer Science, 158(1-2):343–359, 1996.