Computing Probabilistic Bisimilarity Distances for Probabilistic Automata
Giorgio Bacci, Giovanni Bacci, Kim G. Larsen, Radu Mardare, Qiyi Tang, Franck van Breugel
GIORGIO BACCI, GIOVANNI BACCI, KIM G. LARSEN, RADU MARDARE, QIYI TANG, AND FRANCK VAN BREUGEL

Dept. of Computer Science, Aalborg University, Denmark. e-mail address: [email protected]
Dept. of Computer Science, Aalborg University, Denmark. e-mail address: [email protected]
Dept. of Computer Science, Aalborg University, Denmark. e-mail address: [email protected]
Dept. of Computer and Information Sciences, University of Strathclyde, Glasgow, UK. e-mail address: [email protected]
Dept. of Computer Science, Oxford University, UK. e-mail address: [email protected]
DisCoVeri Group, Dept. of Electrical Engineering and Computer Science, York University, Canada. e-mail address: [email protected]
Abstract.
The probabilistic bisimilarity distance of Deng et al. has been proposed as a robust quantitative generalization of Segala and Lynch's probabilistic bisimilarity for probabilistic automata. In this paper, we present a characterization of the bisimilarity distance as the solution of a simple stochastic game. The characterization gives us an algorithm to compute the distances by applying Condon's simple policy iteration on these games. The correctness of Condon's approach, however, relies on the assumption that the games are stopping. Our games may be non-stopping in general, yet we are able to prove termination for this extended class of games. Other algorithms have already been proposed in the literature to compute these distances, with complexity in UP ∩ coUP and PPAD. Despite their theoretical relevance, these algorithms are inefficient in practice. To the best of our knowledge, our algorithm is the first practical solution.

The characterization of the probabilistic bisimilarity distance mentioned above crucially uses a dual presentation of the Hausdorff distance due to Mémoli. As an additional contribution, in this paper we show that Mémoli's result can also be used to prove that the bisimilarity distance bounds the difference in the maximal (or minimal) probability of two states satisfying arbitrary ω-regular properties, expressed, e.g., as LTL formulas.

Key words and phrases: probabilistic automata, behavioural pseudometrics, stochastic games.
∗ This paper is an extended version of an earlier conference paper [BBL+19] presented at CONCUR 2019.
Giovanni Bacci and Kim G. Larsen are supported by the ERC-Project LASSO. Franck van Breugel is supported by the Natural Sciences and Engineering Research Council of Canada.
Preprint submitted to Logical Methods in Computer Science.
© G. Bacci, G. Bacci, K. G. Larsen, R. Mardare, Q. Tang, and F. van Breugel; licensed under Creative Commons.

1. Introduction
In [GJS90], Giacalone et al. observed that, for reasoning about the behaviour of probabilistic systems, a notion of distance is more reasonable in practice than an equivalence, since it permits to capture the degree of difference between two states. This observation motivated the study of behavioural pseudometrics, which generalize behavioural equivalences in the sense that two states are behaviourally equivalent when their distance is zero.

The systems we consider in this paper are labelled probabilistic automata. This model was introduced by Segala [Seg95] to capture both nondeterminism (hence, concurrency) and probabilistic behaviours. The labels on states are used to express that certain properties of interest hold in particular states.

In Figure 1 we consider an example of a probabilistic automaton describing two gamblers, f and b, deciding on which team to bet in a football match. Typically the two gamblers know on which team to bet, but occasionally they prefer to toss a coin to make a decision. This is represented by the three probabilistic transitions in the state f. The first two take f to state h (head) or t (tail) with probability one; the last takes f to states h and t with probability 1/2 each. The difference between f and b is that the former uses a fair coin while the latter uses a biased coin landing on heads with slightly higher probability. Once the decision is taken, it is not changed anymore. This is seen on states h and t, which have a single probabilistic transition taking the state to itself with probability one. The states h and t have distinct labels, here represented by colours.

A behavioural pseudometric for probabilistic automata capturing this difference is the probabilistic bisimilarity distance by Deng et al. [DCPP06], introduced as a robust generalization of Segala and Lynch's probabilistic bisimilarity [SL94].
The key ingredients of this pseudometric are the Hausdorff metric [Hau14] and the Kantorovich metric [Kan42], respectively used to capture nondeterministic and probabilistic behaviour. In the example above, the behaviours of the states h and t are very different, since their labels are different. As a result, their probabilistic bisimilarity distance is one. On the other hand, the behaviours of the states f and b are very similar, which is reflected by a small probabilistic bisimilarity distance.

The first attempt to compute the above distance is due to Chen et al. [CHL07], who proposed a doubly exponential-time procedure to approximate the distances up to any degree of accuracy. The complexity was later improved to PSPACE by Chatterjee et al. [CdAMR08, CdAMR10]. Their solutions exploit the decision procedure for the existential
Figure 1: A probabilistic automaton describing two gamblers.
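The automaton of Figure 1 can be written down as plain data; the sketch below checks the conditions required of a probabilistic automaton (totality, and that each successor is a probability distribution). The 0.51 bias of the unfair coin and the colour labels are our own assumptions, as the paper only says the biased coin lands on heads with "slightly higher" probability.

```python
# Sketch of the two-gambler automaton of Figure 1 as plain Python data.
# The 0.51 bias and the colour labels are illustrative assumptions.

labels = {"f": "white", "b": "white", "h": "green", "t": "red"}

transitions = {
    "f": [{"h": 1.0}, {"t": 1.0}, {"h": 0.5, "t": 0.5}],    # fair coin
    "b": [{"h": 1.0}, {"t": 1.0}, {"h": 0.51, "t": 0.49}],  # biased coin (assumed bias)
    "h": [{"h": 1.0}],
    "t": [{"t": 1.0}],
}

def is_probabilistic_automaton(labels, transitions):
    """Check totality and that every successor is a probability distribution."""
    for s in labels:
        if not transitions.get(s):  # totality: at least one outgoing transition
            return False
        for mu in transitions[s]:
            if abs(sum(mu.values()) - 1.0) > 1e-9 or any(p < 0 for p in mu.values()):
                return False
            if any(u not in labels for u in mu):  # support must consist of states
                return False
    return True

print(is_probabilistic_automaton(labels, transitions))  # True
```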
fragment of the first-order theory of the reals. It is worth noting that [CHL07, CdAMR08] consider the pseudometric that does not discount the future (a.k.a. the undiscounted distance), which entails additional algorithmic challenges. Later, Fu [Fu12] showed that the distances have rational values and that computing the discounted distance can be done in polynomial time by using a value-iteration procedure in combination with the continued fraction algorithm [Sch99, Section 6]. As for the undiscounted distance, he showed that the threshold problem, i.e., deciding whether the distance is smaller than a given rational, is in NP ∩ coNP. The same proof can be adapted to show that the decision problem is in UP ∩ coUP [Fu14], where UP is the subclass of NP-problems with a unique accepting computation. Van Breugel and Worrell [vBW14] have later shown that the problem is in PPAD, which is short for polynomial parity argument on directed graphs. Notably, their proof exploits a characterization of the distance as a simple stochastic game. The above algorithms were presented with the purpose of understanding the complexity of computing bisimilarity distances and, to the best of our knowledge, they have never been implemented. Their implementation would involve either an enumeration of possibly exponentially many fixed points [Fu14], or the use of SMT solvers over the existential fragment of the first-order theory of the reals [CdAMR08, CdAMR10]. An earlier attempt at approximating the bisimilarity distance for the more specific case of labelled Markov chains, by expressing the problem in the existential fragment of the first-order theory of the reals, was proposed in [vBSW08]. Its latest implementation using CVC4 [BCD+11] is able to handle chains with 82 states in approximately 66 hours. In this paper, we propose an alternative approach that is inspired by the successful implementations of similar pseudometrics on labelled Markov chains [BBLM13, TvB16, TvB18a].

Our solution is based on a novel characterization of the probabilistic bisimilarity distance as the solution of a simple stochastic game. Stochastic games were introduced by Shapley [Sha53]. A simplified version of these games, called simple stochastic games, was studied by Condon [Con92]. Several algorithms have been proposed to compute the value function of a simple stochastic game, many using policy iteration. Condon [Con90] proposed an algorithm, known as simple policy iteration, that switches only one non-optimal choice per iteration. The correctness of Condon's algorithm, however, relies on the assumption that the game is stopping.

It turns out that the simple stochastic games characterizing the probabilistic bisimilarity distances are stopping only when the distances discount the future. In case the distance is non-discounting, the corresponding games may not be stopping. To recover correctness of the policy iteration procedure, we adapt Condon's simple policy iteration algorithm by adding a non-local update of the strategy of the min player and an extra termination condition based on a notion of "self-closed" relation due to Fu [Fu12]. The practical efficiency of our algorithm has been evaluated on a significant set of randomly generated probabilistic automata. The results show that our algorithm performs better than the corresponding iterative algorithms proposed for the discounted distances in [Fu12], even though the theoretical complexity of our proposal is exponential in the worst case (cf. [TvB16]) whereas Fu's is polynomial. The implementation of the algorithms exploits a coupling structure characterization of the distance that allows us to skip the construction of the simple stochastic game, which may result in an exponential blow-up of the memory required for storing the game. The code is available at bitbucket.org/discoveri/first-order.
The two characterizations of probabilistic bisimilarity distances proposed in this paper (either via simple stochastic games or via coupling structures) crucially use a dual presentation of the Hausdorff distance due to Mémoli [Mém11]. Still using Mémoli's result, as an additional contribution of this paper we show that the (undiscounted) bisimilarity distance can be used to bound the difference of the maximal (or minimal) probability of two states satisfying arbitrary ω-regular specifications, expressed, e.g., as LTL formulas. Notably, this result allows us to relate the probabilistic bisimilarity pseudometric of Deng et al. to probabilistic model checking of probabilistic automata against linear-time specifications.

Synopsis. Section 2 introduces the notation and some preliminary results used in the paper. In Section 3 we recall the definition of the probabilistic bisimilarity distances of Deng et al. for probabilistic automata; then, in Section 4 we propose a characterisation of the probabilistic bisimilarity distances as the values of a simple stochastic game constructed from the automaton, here called the probabilistic bisimilarity game. Towards an algorithmic solution for computing bisimilarity distances, in Section 5 we provide an alternative characterisation of the distances in terms of coupling structures. Section 6 describes a procedure for computing the bisimilarity distances based on Condon's simple policy iteration algorithm. In Section 7 we discuss the relation between the notion of bisimilarity distance and probabilistic model checking of ω-regular linear-time specifications against probabilistic automata. Finally, Section 8 concludes with some remarks and future work directions.

2. Preliminaries and Notation
The set of functions from X to Y is denoted by Y^X. We denote by f[x/y] ∈ Y^X the update of f ∈ Y^X at x ∈ X with y ∈ Y, defined by f[x/y](x′) = y if x′ = x, and f[x/y](x′) = f(x′) otherwise.

A (1-bounded) pseudometric on a set X is a function d: X × X → [0, 1] such that d(x, x) = 0, d(x, y) = d(y, x), and d(x, y) ≤ d(x, z) + d(z, y) for all x, y, z ∈ X.

Kantorovich lifting. A (discrete) probability distribution on X is a function μ: X → [0, 1] such that Σ_{x∈X} μ(x) = 1, and its support is supp(μ) = {x ∈ X | μ(x) > 0}. We denote by D(X) the set of probability distributions on X. A pseudometric d on X can be lifted to a pseudometric on probability distributions in D(X) by means of the Kantorovich lifting [Vil08]. The Kantorovich lifting of a pseudometric d on X to distributions μ, ν ∈ D(X) is defined by

K(d)(μ, ν) = sup { Σ_{x∈X} f(x)·(μ(x) − ν(x)) | f ∈ L_d },   (Kantorovich lifting)

where L_d denotes the set of non-expansive [0, 1]-valued functions on X, i.e., functions f: X → [0, 1] such that |f(x) − f(y)| ≤ d(x, y) for all x, y ∈ X.

The Kantorovich distance has the following well-known dual formulation:

K(d)(μ, ν) = min { Σ_{x,y∈X} d(x, y)·ω(x, y) | ω ∈ Ω(μ, ν) },

where Ω(μ, ν) denotes the set of measure-couplings for the pair (μ, ν), i.e., distributions ω ∈ D(X × X) such that Σ_{y∈X} ω(x, y) = μ(x) and Σ_{y∈X} ω(y, x) = ν(x) for all x ∈ X. It is a well-known fact that this dual characterisation can be equivalently stated by ranging ω over the set of vertices V(Ω(μ, ν)) of the polytope Ω(μ, ν). Thus, a minimum is always attained at a vertex. Furthermore, if the set X is finite, the set V(Ω(μ, ν)) is finite too.

Hausdorff lifting. A pseudometric d on X can be lifted to nonempty subsets of X by means of the Hausdorff lifting. The Hausdorff lifting of d on nonempty subsets A, B ⊆ X is defined by

H(d)(A, B) = max { sup_{a∈A} inf_{b∈B} d(a, b), sup_{b∈B} inf_{a∈A} d(a, b) }.   (Hausdorff lifting)
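For finite X, the dual (coupling) formulation of the Kantorovich lifting is a transportation problem and can be solved as a small linear program. The sketch below uses scipy; the 0/1 example metric is an illustrative assumption of ours (with it, K(d) coincides with the total variation distance).

```python
# The Kantorovich lifting computed via its dual (transportation) formulation:
# minimise Σ d(x,y)·ω(x,y) over all couplings ω with marginals μ and ν.
import numpy as np
from scipy.optimize import linprog

def kantorovich(d, mu, nu):
    """min over measure-couplings ω ∈ Ω(μ,ν) of Σ_{x,y} d(x,y)·ω(x,y)."""
    n = len(mu)
    cost = np.array([d[x][y] for x in range(n) for y in range(n)])
    A_eq, b_eq = [], []
    for x in range(n):                       # row marginals: Σ_y ω(x,y) = μ(x)
        row = np.zeros(n * n); row[x * n:(x + 1) * n] = 1
        A_eq.append(row); b_eq.append(mu[x])
    for y in range(n):                       # column marginals: Σ_x ω(x,y) = ν(y)
        col = np.zeros(n * n); col[y::n] = 1
        A_eq.append(col); b_eq.append(nu[y])
    res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.fun

# With the discrete (0/1) metric, K(d) is the total variation distance:
print(kantorovich([[0, 1], [1, 0]], [0.5, 0.5], [0.51, 0.49]))  # ≈ 0.01
```

The minimum is attained at a vertex of the coupling polytope, which is what the simplex-style LP solver returns.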
Following Mémoli [Mém11, Lemma 3.1], the Hausdorff lifting has a dual characterization in terms of set-couplings. Given A, B ⊆ X, a set-coupling for (A, B) is a relation R ⊆ X × X with left and right projections respectively equal to A and B, i.e., {a | ∃b ∈ X. a R b} = A and {b | ∃a ∈ X. a R b} = B. We write R(A, B) for the set of set-couplings for (A, B).

Theorem 2.1 ([Mém11]). H(d)(A, B) = inf { sup_{(a,b)∈R} d(a, b) | R ∈ R(A, B) }.

Clearly, for finite A, B, the inf and sup in Theorem 2.1 can be replaced by min and max, respectively.
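On small finite sets, Mémoli's duality can be checked by brute force: enumerate all relations with full projections and take the best worst-case distance. The pseudometric values below are an illustrative assumption.

```python
# Brute-force check of Theorem 2.1 on finite sets: the Hausdorff lifting equals
# the least worst-case distance over set-couplings. Example distances are ours.
from itertools import combinations

def hausdorff(d, A, B):
    """Primal definition: max of the two directed sup-inf distances."""
    return max(max(min(d[a][b] for b in B) for a in A),
               max(min(d[a][b] for a in A) for b in B))

def hausdorff_dual(d, A, B):
    """min over set-couplings R of (A, B) of max_{(a,b) in R} d(a,b)."""
    pairs = [(a, b) for a in A for b in B]
    best = float("inf")
    for k in range(1, len(pairs) + 1):
        for R in combinations(pairs, k):
            # R is a set-coupling iff its projections are exactly A and B.
            if {a for a, _ in R} == set(A) and {b for _, b in R} == set(B):
                best = min(best, max(d[a][b] for a, b in R))
    return best

d = {0: {0: 0, 1: 0.3, 2: 0.7},
     1: {0: 0.3, 1: 0, 2: 0.5},
     2: {0: 0.7, 1: 0.5, 2: 0}}
print(hausdorff(d, [0, 1], [1, 2]), hausdorff_dual(d, [0, 1], [1, 2]))  # 0.5 0.5
```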
3. Probabilistic Automata and Probabilistic Bisimilarity Distance
In this section we recall some definitions and results from the literature. In particular, we introduce the model of interest, probabilistic automata, its best-known behavioural equivalence, Segala and Lynch's probabilistic bisimilarity [SL94], and its quantitative generalization due to Deng et al. [DCPP06].

A probabilistic automaton is a model of computation that combines nondeterministic and probabilistic behaviours. Similarly to a standard nondeterministic automaton, states are labelled to express that certain properties of interest hold in that state. A probabilistic automaton in a current state s ∈ S can nondeterministically proceed to a next probabilistic state μ ∈ D(S), representing the probability distribution with which the automaton will move to the next state. This can be formalised as follows:

Definition 3.1. A probabilistic automaton (PA) is a tuple A = (S, L, →, ℓ) consisting of a nonempty finite set S of states, a finite set of labels L, a finite total transition relation → ⊆ S × D(S), and a labelling function ℓ: S → L.

For simplicity we assume the transition relation → to be total, that is, for all s ∈ S there exists a μ ∈ D(S) such that (s, μ) ∈ →. For the remainder of this paper we fix a probabilistic automaton A = (S, L, →, ℓ). We write s → μ to denote (s, μ) ∈ → and use δ(s) to denote the set {μ | s → μ} of successor distributions of s.

Next we recall the notion of probabilistic bisimilarity due to Segala and Lynch [SL94] for probabilistic automata. Their definition exploits the notion of lifting of a relation R ⊆ S × S on states to a relation R̃ ⊆ D(S) × D(S) on probability distributions on states, originally introduced by Jonsson and Larsen [JL91], and defined by μ R̃ ν if there exists a measure-coupling ω ∈ Ω(μ, ν) such that supp(ω) ⊆ R.

Definition 3.2.
A relation R ⊆ S × S is a probabilistic bisimulation if, whenever s R t,
• ℓ(s) = ℓ(t),
• if s → μ then there exists t → ν such that μ R̃ ν, and
• if t → ν then there exists s → μ such that μ R̃ ν.
Two states s, t ∈ S are probabilistic bisimilar, written s ∼ t, if they are related by some probabilistic bisimulation.

(Footnote: Mémoli uses the terminology "correspondence." To avoid confusion, we adopted the same terminology used in [PC19, Section 10.6].)

Intuitively, two states are probabilistic bisimilar if they have the same label and each transition of the one state to a distribution μ can be matched by a transition of the other state to a distribution ν assigning the same probability to states that behave the same, and vice versa. Probabilistic bisimilarity is an equivalence relation and the largest probabilistic bisimulation.

Deng et al. [DCPP06] proposed a family of 1-bounded pseudometrics d_λ, parametric in a discount factor λ ∈ (0, 1], called probabilistic bisimilarity pseudometrics. The pseudometrics d_λ are defined as the least fixed points of the functions ∆_λ: [0, 1]^{S×S} → [0, 1]^{S×S} given by

∆_λ(d)(s, t) = 1 if ℓ(s) ≠ ℓ(t), and ∆_λ(d)(s, t) = λ · H(K(d))(δ(s), δ(t)) otherwise.

The well-definedness of d_λ follows by the Knaster-Tarski fixed-point theorem, given the fact that ∆_λ is a monotone function on the complete partial order [0, 1]^{S×S}, ordered point-wise by d ⊑ d′ iff d(s, t) ≤ d′(s, t) for all s, t ∈ S.

The fact that probabilistic bisimilarity distances provide a quantitative generalization of bisimilarity is captured by the following theorem due to Deng et al. [DCPP06, Corollary 2.14].

Theorem 3.3.
For all λ ∈ (0, 1], d_λ(s, t) = 0 if and only if s ∼ t.

4. Probabilistic Bisimilarity Distance as a Simple Stochastic Game

A simple stochastic game (SSG) consists of a finite directed graph whose vertices are partitioned into sets of 0-sinks, 1-sinks, max vertices, min vertices, and random vertices. The game is played by two players, the max player and the min player, with a single token. At each step of the game, the token is moved from a vertex to one of its successors. At a min vertex the min player chooses the successor, at a max vertex the max player chooses the successor, and at a random vertex the successor is chosen randomly according to a prescribed probability distribution. The max player wins a play of the game if the token reaches a 1-sink, and the min player wins if the play reaches a 0-sink or continues forever without reaching a sink. Since the game is stochastic, the max player tries to maximize the probability of reaching a 1-sink, whereas the min player tries to minimize that probability.

Definition 4.1. A simple stochastic game is a tuple (V, E, P) consisting of
• a finite directed graph (V, E) such that
  – V is partitioned into the sets: V_0 of 0-sinks, V_1 of 1-sinks, V_max of max vertices, V_min of min vertices, and V_rnd of random vertices;
  – the vertices in V_0 and V_1 have outdegree zero and all other vertices have outdegree at least one, and
• a function P: V_rnd → D(V) such that, for all v ∈ V_rnd and w ∈ V, P(v)(w) > 0 iff (v, w) ∈ E.

The above definition is slightly more general than the one given by Condon in [Con92, Section 2]. Note that the outdegree of min, max and random vertices is at least one (instead of exactly two), and that there may be multiple 0-sinks and 1-sinks (rather than exactly one).
However, a simple stochastic game as defined above can be transformed in polynomial time into a simple stochastic game as defined in [Con92], as shown by Zwick and Paterson [ZP96].

A strategy, also known as a policy, for the min player is a function σ_min: V_min → V that assigns the target of an outgoing edge to each min vertex, that is, (v, σ_min(v)) ∈ E for all v ∈ V_min. Likewise, a strategy for the max player is a function σ_max: V_max → V that assigns the target of an outgoing edge to each max vertex. These strategies are known as pure stationary strategies. We can restrict ourselves to these strategies since, for both players, the optimal among all strategies are of this type (see, for example, [LL69]).

Such strategies determine a sub-game in which each max vertex and each min vertex has outdegree one (see [Con92, Section 2] for details). Such a game can naturally be viewed as a Markov chain. We write φ_{σ_min,σ_max}: V → [0, 1] for the function that gives the probability of a vertex in this Markov chain to reach a 1-sink.

The value function φ: V → [0, 1] of an SSG is defined as min_{σ_min} max_{σ_max} φ_{σ_min,σ_max}. It is folklore that the value function of a simple stochastic game can be characterised as the least fixed point of the following function (see, for example, [Jub05, Sections 2.2 and 2.3]).

Definition 4.2.
The function Φ: [0, 1]^V → [0, 1]^V is defined by

Φ(f)(v) = 0 if v ∈ V_0,
Φ(f)(v) = 1 if v ∈ V_1,
Φ(f)(v) = max_{(v,w)∈E} f(w) if v ∈ V_max,
Φ(f)(v) = min_{(v,w)∈E} f(w) if v ∈ V_min,
Φ(f)(v) = Σ_{(v,w)∈E} P(v)(w)·f(w) if v ∈ V_rnd.

The set [0, 1]^V can be ordered point-wise by f ⊑ g iff f(v) ≤ g(v) for all v ∈ V. This partial order is complete on [0, 1]^V, with meet and join respectively given by the point-wise infimum and supremum. Then, the existence of the least fixed point of Φ is ensured by the Knaster-Tarski fixed-point theorem and the following result.

Proposition 4.3.
The function Φ is monotone.

Proof. Let f, g ∈ [0, 1]^V with f ⊑ g, and let v ∈ V. It suffices to show that Φ(f)(v) ≤ Φ(g)(v). We distinguish the following cases.
• If v ∈ V_0 then Φ(f)(v) = 0 = Φ(g)(v).
• If v ∈ V_1 then Φ(f)(v) = 1 = Φ(g)(v).
• If v ∈ V_max then Φ(f)(v) = max_{(v,w)∈E} f(w) ≤ max_{(v,w)∈E} g(w) = Φ(g)(v).
• If v ∈ V_min then Φ(f)(v) = min_{(v,w)∈E} f(w) ≤ min_{(v,w)∈E} g(w) = Φ(g)(v).
• If v ∈ V_rnd then Φ(f)(v) = Σ_{(v,w)∈E} P(v)(w)·f(w) ≤ Σ_{(v,w)∈E} P(v)(w)·g(w) = Φ(g)(v).

The set [0, 1]^V can be turned into a Banach space by means of the supremum norm ‖f‖ = max_{v∈V} f(v). Recall that a function F: [0, 1]^V → [0, 1]^V is non-expansive if ‖F(f) − F(g)‖ ≤ ‖f − g‖ for all f, g ∈ [0, 1]^V.

Proposition 4.4.
The function Φ is non-expansive.

Proof. Let f, g ∈ [0, 1]^V and let v ∈ V. It suffices to show that |Φ(f)(v) − Φ(g)(v)| ≤ ‖f − g‖. We distinguish the following cases.
• If v ∈ V_0 then |Φ(f)(v) − Φ(g)(v)| = |0 − 0| = 0 ≤ ‖f − g‖.
• If v ∈ V_1 then |Φ(f)(v) − Φ(g)(v)| = |1 − 1| = 0 ≤ ‖f − g‖.
• Let v ∈ V_max. Without loss of generality, assume that max_{(v,w)∈E} f(w) ≥ max_{(v,w)∈E} g(w), and let x ∈ V realise the maximum of {f(w) | (v, w) ∈ E}. Then

|Φ(f)(v) − Φ(g)(v)| = max_{(v,w)∈E} f(w) − max_{(v,w)∈E} g(w) = f(x) − max_{(v,w)∈E} g(w) ≤ f(x) − g(x) ≤ ‖f − g‖.

• Let v ∈ V_min. Without loss of generality, assume that min_{(v,w)∈E} f(w) ≥ min_{(v,w)∈E} g(w), and let x ∈ V realise the minimum of {g(w) | (v, w) ∈ E}. Then

|Φ(f)(v) − Φ(g)(v)| = min_{(v,w)∈E} f(w) − min_{(v,w)∈E} g(w) = min_{(v,w)∈E} f(w) − g(x) ≤ f(x) − g(x) ≤ ‖f − g‖.

• If v ∈ V_rnd then

|Φ(f)(v) − Φ(g)(v)| = |Σ_{(v,w)∈E} P(v)(w)·f(w) − Σ_{(v,w)∈E} P(v)(w)·g(w)| = |Σ_{(v,w)∈E} P(v)(w)·(f(w) − g(w))| ≤ Σ_{(v,w)∈E} P(v)(w)·‖f − g‖ ≤ ‖f − g‖.

A Probabilistic Bisimilarity Game.
Fix a probabilistic automaton A and λ ∈ (0, 1]. We define a probabilistic bisimilarity game, where the min player tries to show that two states are probabilistic bisimilar, while the max player tries to prove the opposite.

In our probabilistic bisimilarity game, there is a vertex (s, t) for each pair of states s and t in A. If ℓ(s) ≠ ℓ(t) then the vertex (s, t) is a 1-sink. Otherwise, (s, t) is a min vertex. In this vertex, the min player selects a set R ∈ R(δ(s), δ(t)) of pairs of transitions. This set R captures potential matchings of transitions from state s and state t. Subsequently, the max player chooses a pair of transitions from the set R. Once the max player has chosen a pair (μ, ν) from the set R, corresponding to the transitions s → μ and t → ν, the min player can choose a measure-coupling ω ∈ Ω(μ, ν). To ensure that the game graph is finite, we restrict our attention to the vertices V(Ω(μ, ν)) of the polytope Ω(μ, ν). Such a measure-coupling ω captures a matching of the probability distributions μ and ν. Recall that a measure-coupling is a probability distribution on S × S. From a random vertex ω, the game proceeds to vertex (u, v) with probability λ·ω(u, v) and to the 0-sink vertex ⊥ with probability 1 − λ. Intuitively, the choices of R ∈ R(δ(s), δ(t)) and then (μ, ν) ∈ R, performed respectively by the min and the max player, correspond to the min and max of Theorem 2.1; analogously, the selection of ω ∈ V(Ω(μ, ν)) by the min player models the min in the definition of the Kantorovich lifting.

Formally, our probabilistic bisimilarity game for the automaton A is defined as follows.

Definition 4.5. Let λ ∈ (0, 1]. The probabilistic bisimilarity game (V, E, P) is defined by
• V_0 = {⊥},
• V_1 = {(s, t) ∈ S × S | ℓ(s) ≠ ℓ(t)},
• V_max = ⋃ {R(δ(s), δ(t)) | (s, t) ∈ V_min},
• V_min = {(s, t) ∈ S × S | ℓ(s) = ℓ(t)} ∪ ⋃ {R | R ∈ V_max},
• V_rnd = ⋃ {V(Ω(μ, ν)) | (μ, ν) ∈ V_min},
• E = {((s, t), R) | (s, t) ∈ V_min ∧ R ∈ R(δ(s), δ(t))}
      ∪ {(R, (μ, ν)) | R ∈ V_max ∧ (μ, ν) ∈ R}
      ∪ {((μ, ν), ω) | (μ, ν) ∈ V_min ∧ ω ∈ V(Ω(μ, ν))}
      ∪ {(ω, (u, v)) | ω ∈ V_rnd ∧ (u, v) ∈ supp(ω)}
      ∪ {(ω, ⊥) | ω ∈ V_rnd},
and, for all ω ∈ V_rnd and (s, t) ∈ supp(ω), P(ω)((s, t)) = λ·ω(s, t) and P(ω)(⊥) = 1 − λ.

By construction of the probabilistic bisimilarity game, there is a direct correspondence between the function Φ from Definition 4.2 associated to the probabilistic bisimilarity game and the function ∆_λ from Section 3 associated to the probabilistic automaton. From this correspondence it is straightforward that the respective least fixed points of Φ and ∆_λ agree, that is, the probabilistic bisimilarity distances of a probabilistic automaton are the values of the corresponding vertices of the probabilistic bisimilarity game.

Theorem 4.6.
For all λ ∈ (0, 1] and s, t ∈ S, d_λ(s, t) = φ(s, t).

Proof. The proof is similar to that of [vBW14, Theorem 14]. Let φ be the value function of the probabilistic bisimilarity game. Since Φ is monotone and non-expansive (Propositions 4.3 and 4.4), we conclude from [vB12, Corollary 1] that the closure ordinal of Φ is ω, that is, φ is the least upper bound of {Φ^n(0) | n ∈ ℕ}, where 0 denotes the function mapping every vertex to zero. Similarly, d_λ is the least upper bound of {∆_λ^n(0) | n ∈ ℕ}, where 0 denotes the function mapping every state pair to zero. Therefore, it suffices to show, by induction on n, that for all s, t ∈ S and n ∈ ℕ,

Φ^n(0)(s, t) = ∆_λ^n(0)(s, t).

Obviously, the above holds if n = 0. Let n > 0. We distinguish the following cases.
12 12
11 1 R = { ( t , u ) , ( u + v , u ) } , R (cid:48) = { ( u , u ) } . ( t, u ) R (cid:0) t , u (cid:1) (cid:0) u + v , u (cid:1) ( t,u ) 12 ( u,u ) + ( v,u ) ( u, u ) ( v, u ) R (cid:48) ( u , u ) ( u,u )
12 12
Figure 2: (Top left) A probabilistic automaton and (right) the associated simple stochastic game constructed as in Definition 4.5 for λ = 1 (only the portion reachable from (t, u) is shown), where δ_x denotes the Dirac distribution concentrated at x.

• If ℓ(s) ≠ ℓ(t) then the vertex (s, t) is a 1-sink and, hence, Φ^n(0)(s, t) = 1 = ∆_λ^n(0)(s, t).
• If ℓ(s) = ℓ(t) then

Φ^n(0)(s, t)
= min_{R∈R(δ(s),δ(t))} Φ^{n−1}(0)(R)
= min_{R∈R(δ(s),δ(t))} max_{(μ,ν)∈R} Φ^{n−1}(0)(μ, ν)
= min_{R∈R(δ(s),δ(t))} max_{(μ,ν)∈R} min_{ω∈V(Ω(μ,ν))} Φ^{n−1}(0)(ω)
= min_{R∈R(δ(s),δ(t))} max_{(μ,ν)∈R} min_{ω∈V(Ω(μ,ν))} ( Σ_{(u,v)∈supp(ω)} λ·ω(u, v)·Φ^{n−1}(0)(u, v) + (1 − λ)·Φ^{n−1}(0)(⊥) )
= min_{R∈R(δ(s),δ(t))} max_{(μ,ν)∈R} min_{ω∈V(Ω(μ,ν))} λ · Σ_{u,v∈S} ω(u, v)·Φ^{n−1}(0)(u, v)   (⊥ is a 0-sink)
= min_{R∈R(δ(s),δ(t))} max_{(μ,ν)∈R} min_{ω∈V(Ω(μ,ν))} λ · Σ_{u,v∈S} ω(u, v)·∆_λ^{n−1}(0)(u, v)   (by induction)
= λ · min_{R∈R(δ(s),δ(t))} max_{(μ,ν)∈R} K(∆_λ^{n−1}(0))(μ, ν)
= λ · H(K(∆_λ^{n−1}(0)))(δ(s), δ(t))   (Theorem 2.1)
= ∆_λ^n(0)(s, t).

Consider a state pair (s, t) with s ∼ t. By Theorem 3.3, d_λ(s, t) = 0. Hence, from Theorem 4.6 we can conclude that φ(s, t) = 0. Therefore, by pre-computing probabilistic bisimilarity, (s, t) can be represented as a 0-sink, rather than a min vertex. For example, in Figure 2 this amounts to turning (u, u) into a 0-sink and disconnecting it from its successors.

Games similar to the probabilistic bisimilarity game introduced above have been presented in [DLT08, vBW14, FKP17, KM18]. The game presented by van Breugel and Worrell
in [vBW14] is most closely related to our game. They also consider probabilistic automata and map a probabilistic automaton to a simple stochastic game. The only difference is that they use the original definition of the Hausdorff distance, whereas we use Mémoli's alternative characterization. The games described in [DLT08, FKP17, KM18] are not stochastic. Desharnais, Laviolette and Tracol [DLT08] define an ε-probabilistic bisimulation game for probabilistic automata, where ε > 0.
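The fixed-point characterisation of the value function (Definition 4.2) can be made concrete: Kleene iteration of Φ from the zero function converges to φ. The sketch below runs this iteration on a tiny hand-made game, which is our own toy illustration, not a game constructed from a probabilistic automaton via Definition 4.5.

```python
# Kleene iteration of the operator Φ from Definition 4.2, starting from the
# zero function; the tiny game below is an illustrative assumption.
V0, V1 = {"zero"}, {"one"}
Vmax, Vmin, Vrnd = {"a"}, {"b"}, {"r"}
E = {"a": ["b", "one"], "b": ["a", "r"], "r": ["zero", "one"]}
P = {"r": {"zero": 0.5, "one": 0.5}}

def phi(f):
    """One application of Φ: sinks are constant, max/min pick extremes, random averages."""
    g = {}
    for v in V0: g[v] = 0.0
    for v in V1: g[v] = 1.0
    for v in Vmax: g[v] = max(f[w] for w in E[v])
    for v in Vmin: g[v] = min(f[w] for w in E[v])
    for v in Vrnd: g[v] = sum(P[v][w] * f[w] for w in E[v])
    return g

f = {v: 0.0 for v in V0 | V1 | Vmax | Vmin | Vrnd}
for _ in range(100):   # iterate towards the least fixed point
    f = phi(f)
print(f["a"], f["b"])  # 1.0 0.5
```

Here the max vertex a can always force a move to the 1-sink (value 1), while the min vertex b prefers the random vertex r, which reaches a 1-sink with probability 1/2.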
5. A Coupling Characterisation of the Bisimilarity Distance
In this section we provide an alternative characterisation of the probabilistic bisimilarity distance d_λ based on the notion of coupling structure for a probabilistic automaton. This characterisation generalises the one by Chen et al. [CvBW12, Theorem 8] (see also [BBLM13, Theorem 8]) for the bisimilarity pseudometric of Desharnais et al. [DGJP04] for labelled Markov chains. Our construction exploits Mémoli's dual characterisation of the Hausdorff distance (Theorem 2.1).

Definition 5.1. A coupling structure for A is a tuple C = (f, ρ) consisting of
• a map f: D(S) × D(S) → D(S × S) such that f(μ, ν) ∈ Ω(μ, ν) for all μ, ν ∈ D(S), and
• a map ρ: S × S → 2^{D(S)×D(S)} such that ρ(s, t) ∈ R(δ(s), δ(t)) for all s, t ∈ S.

For convenience, the components f and ρ of a coupling structure will be called the measure-coupling map and the set-coupling map, respectively.

The definition of coupling structure is better understood in relation to the automaton it induces. The probabilistic automaton induced by C = (f, ρ), denoted A_C = (S × S, L × L, →_C, ℓ_C), has S × S as set of states, L × L as set of labels, transition relation →_C ⊆ (S × S) × D(S × S) defined as (s, t) →_C f(μ, ν) iff (μ, ν) ∈ ρ(s, t), and labelling function ℓ_C: S × S → L × L defined as ℓ_C(s, t) = (ℓ(s), ℓ(t)). Intuitively, A_C describes the concurrent execution of two copies of the probabilistic automaton A, synchronized by the coupling structure C.

Let λ ∈ (0, 1]. For a coupling structure C we define the function Γ_λ^C: [0, 1]^{S×S} → [0, 1]^{S×S} as

Γ_λ^C(d)(s, t) = 1 if ℓ(s) ≠ ℓ(t), and Γ_λ^C(d)(s, t) = λ · max { Σ_{u,v∈S} d(u, v)·ω(u, v) | (s, t) →_C ω } otherwise.

Lemma 5.2.
The function Γ^C_λ is well-defined and monotone.

Proof. The well-definedness of Γ^C_λ follows from the fact that λ ∈ (0, 1] and ∑_{u,v ∈ S} d(u, v) · ω(u, v) is a convex combination of the values (d(u, v))_{u,v ∈ S} ⊆ [0, 1].

As for monotonicity, let d, d′ ∈ [0, 1]^(S×S) with d ⊑ d′. Let s, t ∈ S; it suffices to show that Γ^C_λ(d)(s, t) ≤ Γ^C_λ(d′)(s, t). We distinguish the following cases:
• If ℓ(s) ≠ ℓ(t), then Γ^C_λ(d)(s, t) = 1 = Γ^C_λ(d′)(s, t).
• If ℓ(s) = ℓ(t), then we have
  Γ^C_λ(d)(s, t) = λ · max { ∑_{u,v ∈ S} d(u, v) · ω(u, v) | (s, t) →_C ω }
    = λ · ∑_{u,v ∈ S} d(u, v) · ω*(u, v)            (for some (s, t) →_C ω*)
    ≤ λ · ∑_{u,v ∈ S} d′(u, v) · ω*(u, v)           (d ⊑ d′ and ω*(u, v) ≥ 0 for all u, v ∈ S)
    ≤ λ · max { ∑_{u,v ∈ S} d′(u, v) · ω(u, v) | (s, t) →_C ω }   ((s, t) →_C ω*)
    = Γ^C_λ(d′)(s, t).

By the Knaster-Tarski fixed point theorem, Γ^C_λ has a least fixed point, denoted by γ^C_λ. As in [BBLM13], we call γ^C_λ the λ-discounted discrepancy w.r.t. C, or simply the λ-discrepancy.

Remark 5.3.
Note that the 1-discrepancy γ^C_1(s, t) is the maximal probability of reaching a pair of states (u, v) with ℓ(u) ≠ ℓ(v) in the probabilistic automaton A_C, starting from the state pair (s, t). It is well known that the maximal reachability probability can be computed in polynomial time as the optimal solution of a linear program (see [BK08, Theorem 10.100] or [Put94, Chapter 6]). The linear program can be trivially generalized to compute γ^C_λ for any λ ∈ (0, 1].
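As a complement to the linear-programming view: for λ < 1 the operator Γ^C_λ is a contraction, so the λ-discrepancy can also be approximated by simply iterating Γ^C_λ from the constant zero function. A minimal Python sketch (not the paper's implementation), under the assumption that the induced automaton A_C is given as a dictionary `steps` mapping each state pair to the list of its couplings ω (each a dictionary from successor pairs to probabilities), and `mismatch` collects the pairs with differing labels:

```python
def discrepancy(steps, mismatch, lam, tol=1e-9):
    """Approximate the λ-discrepancy γ^C_λ by iterating Γ^C_λ from 0.

    steps[(s, t)] : list of couplings ω, each a dict {(u, v): probability}
    mismatch      : set of pairs (s, t) with ℓ(s) ≠ ℓ(t) (value fixed to 1)
    """
    d = {p: (1.0 if p in mismatch else 0.0) for p in steps}
    while True:
        new = {}
        for p, omegas in steps.items():
            if p in mismatch:
                new[p] = 1.0
            else:
                # Γ^C_λ: discounted best expected value over available couplings
                new[p] = lam * max(
                    sum(prob * d[q] for q, prob in omega.items())
                    for omega in omegas)
        if max(abs(new[p] - d[p]) for p in steps) < tol:
            return new
        d = new

# toy induced automaton: pair ("s","t") loops on itself with probability 0.5
# and otherwise moves to the label-mismatching pair ("u","v")
steps = {("s", "t"): [{("s", "t"): 0.5, ("u", "v"): 0.5}],
         ("u", "v"): [{("u", "v"): 1.0}]}
gamma = discrepancy(steps, mismatch={("u", "v")}, lam=0.5)
```

On this toy instance the λ-discrepancy of ("s","t") solves x = λ(0.5·x + 0.5·1), i.e., x = 1/3 for λ = 0.5, and the iteration converges to it geometrically.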
Lemma 5.4. For all λ ∈ (0, 1] and coupling structures C of A, Δ_λ(γ^C_λ) ⊑ γ^C_λ.

Proof. Let C = (f, ρ). Let s, t ∈ S and R = ρ(s, t). We distinguish two cases.
• If ℓ(s) ≠ ℓ(t), then Δ_λ(γ^C_λ)(s, t) = 1 = γ^C_λ(s, t).
• If ℓ(s) = ℓ(t), then
  Δ_λ(γ^C_λ)(s, t) = λ · H(K(γ^C_λ))(δ(s), δ(t))   (def. Δ_λ)
    = λ · min { max_{(µ,ν) ∈ R′} K(γ^C_λ)(µ, ν) | R′ ∈ R(δ(s), δ(t)) }   (Theorem 2.1)
    ≤ λ · max_{(µ,ν) ∈ R} K(γ^C_λ)(µ, ν)   (R ∈ R(δ(s), δ(t)))
    = λ · max_{(µ,ν) ∈ R} min_{ω ∈ Ω(µ,ν)} ∑_{u,v ∈ S} γ^C_λ(u, v) · ω(u, v)   (def. K(γ^C_λ))
    ≤ λ · max_{(µ,ν) ∈ R} ∑_{u,v ∈ S} γ^C_λ(u, v) · f(µ, ν)(u, v)   (f(µ, ν) ∈ Ω(µ, ν))
    = λ · max { ∑_{u,v ∈ S} γ^C_λ(u, v) · ω(u, v) | (s, t) →_C ω }   (def. →_C)
    = Γ^C_λ(γ^C_λ)(s, t)   (def. Γ^C_λ)
    = γ^C_λ(s, t).   (γ^C_λ fixed point of Γ^C_λ)

By the generality of the chosen s and t, we conclude that Δ_λ(γ^C_λ) ⊑ γ^C_λ.

Corollary 5.5.
For all λ ∈ (0, 1] and coupling structures C for A, d_λ ⊑ γ^C_λ.

Proof. By the Knaster-Tarski fixed point theorem, d_λ is the least prefixed point of Δ_λ; therefore, by Lemma 5.4 we can conclude that d_λ ⊑ γ^C_λ.

The next lemma shows that the probabilistic bisimilarity distance can be characterised as the λ-discrepancy of a vertex coupling structure, that is, a coupling structure C = (f, ρ) such that f(µ, ν) ∈ V(Ω(µ, ν)) for all µ, ν ∈ D(S).

Lemma 5.6.
For all λ ∈ (0, 1], there exists a vertex coupling structure C for A such that d_λ = γ^C_λ.
Proof.
We construct a vertex coupling structure C = (f, ρ) as follows. We define f : D(S) × D(S) → D(S × S) by

  f(µ, ν) ∈ argmin_{ω ∈ V(Ω(µ,ν))} ∑_{u,v ∈ S} d_λ(u, v) · ω(u, v).

Hence,

  K(d_λ)(µ, ν) = ∑_{u,v ∈ S} d_λ(u, v) · f(µ, ν)(u, v).   (5.1)

We define ρ : S × S → 2^(D(S) × D(S)) by

  ρ(s, t) = { (µ, argmin_{ν ∈ δ(t)} K(d_λ)(µ, ν)) | µ ∈ δ(s) } ∪ { (argmin_{µ ∈ δ(s)} K(d_λ)(µ, ν), ν) | ν ∈ δ(t) }.

Hence, ρ(s, t) ∈ R(δ(s), δ(t)) and

  H(K(d_λ))(δ(s), δ(t)) = max { K(d_λ)(µ, ν) | (µ, ν) ∈ ρ(s, t) }.   (5.2)

Next, we show that Γ^C_λ(d_λ) ⊑ d_λ. Let s, t ∈ S. We distinguish two cases:
• If ℓ(s) ≠ ℓ(t), then d_λ(s, t) = Δ_λ(d_λ)(s, t) = 1 = Γ^C_λ(d_λ)(s, t).
• If ℓ(s) = ℓ(t), we have
  Γ^C_λ(d_λ)(s, t) = λ · max { ∑_{u,v ∈ S} d_λ(u, v) · ω(u, v) | (s, t) →_C ω }   (def. Γ^C_λ)
    = λ · max { ∑_{u,v ∈ S} d_λ(u, v) · f(µ, ν)(u, v) | (µ, ν) ∈ ρ(s, t) }   (def. →_C)
    = λ · max { K(d_λ)(µ, ν) | (µ, ν) ∈ ρ(s, t) }   (eq. (5.1))
    = λ · H(K(d_λ))(δ(s), δ(t))   (eq. (5.2))
    = d_λ(s, t).   (d_λ fixed point of Δ_λ)

Therefore Γ^C_λ(d_λ) = d_λ. Since γ^C_λ is the least fixed point of Γ^C_λ, by the Knaster-Tarski fixed point theorem γ^C_λ ⊑ d_λ. Moreover, by Corollary 5.5, d_λ ⊑ γ^C_λ. Thus d_λ = γ^C_λ.

Theorem 5.7.
Let λ ∈ (0, 1]. Then, the following hold:
(1) d_λ = ⊓ { γ^C_λ | C a coupling structure for A };
(2) s ∼ t iff γ^C_λ(s, t) = 0 for some vertex coupling structure C for A.

Proof. (1) follows by Corollary 5.5 and Lemma 5.6; (2) by Theorem 3.3 and Lemma 5.6.

Note that, together with Lemma 5.6, Theorem 5.7(1) states that d_λ is the minimal λ-discrepancy obtained by ranging over the subset of vertex coupling structures.

Remark 5.8 (On the relation with probabilistic bisimilarity games). The coupling structure characterization of the distance is strongly related to the simple stochastic game characterization presented in Section 4. Indeed, the notion of vertex coupling structure essentially captures the strategies for the min player in a probabilistic bisimilarity game, in the following sense: the measure-coupling map describes the strategy on the vertices of the form (µ, ν) ∈ R for some R ∈ V_max, while the set-coupling map describes the strategy on the min vertices (s, t) ∈ S × S. The discrepancy γ^C captures the value w.r.t. an optimal strategy for the max player when the min player has fixed their strategy a priori.

6. Computing the Bisimilarity Distance
We describe a procedure for computing the bisimilarity distances based on Condon's simple policy iteration algorithm [Con90]. Our procedure extends a similar one proposed in [TvB16, BBLM13] for computing the bisimilarity distances of Desharnais et al. [DGJP04] for labelled Markov chains. The extension takes into account the additional presence of nondeterminism in the choice of the transitions.

Condon's simple policy iteration algorithm computes the values of a simple stochastic game provided that the game is stopping, i.e., for each pair of strategies for the min and max players, the token reaches a 0-sink or 1-sink vertex with probability one.

As we have shown in Theorem 4.6, the probabilistic bisimilarity distances are the values of the corresponding vertices in the simple stochastic game given in Definition 4.5. Thus, if we prove that the game is stopping, we can apply Condon's simple policy iteration algorithm to compute the probabilistic bisimilarity distances.
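The stopping requirement can be checked numerically on small instances: in a stopping game, the Markov chain induced by any fixed strategy pair reaches a sink with probability one. A sketch (not the paper's implementation) that approximates, by iteration, the probability of eventually reaching a distinguished sink in a finite Markov chain given as a hypothetical dictionary of transition distributions:

```python
def reach_probability(chain, target, iters=2000):
    """Iteratively approximate, for every state of a finite Markov chain,
    the probability of eventually reaching the `target` state.

    chain[s] : dict mapping successor states to transition probabilities
    """
    p = {s: (1.0 if s == target else 0.0) for s in chain}
    for _ in range(iters):
        p = {s: 1.0 if s == target else
                sum(pr * p[t] for t, pr in chain[s].items())
             for s in chain}
    return p

# induced chain of a discounted game: every round the vertex moves to the
# 0-sink "bot" with probability 1 - lam and otherwise stays in play
lam = 0.8
chain = {"v": {"v": lam, "bot": 1 - lam}, "bot": {"bot": 1.0}}
probs = reach_probability(chain, "bot")
```

In the discounted case every vertex moves to the 0-sink ⊥ with probability 1 − λ at each round, so the reachability probability is one, matching Proposition 6.1 below.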
Proposition 6.1.
For λ ∈ (0, 1), the simple stochastic game in Definition 4.5 is stopping.

Proof. For each pair of strategies for the min and max players, each vertex in the induced Markov chain reaches the 0-sink vertex ⊥ with probability at least 1 − λ. Since λ < 1, from any state the probability of never reaching ⊥ is zero, i.e., the probability of eventually reaching the sink state ⊥ is one.

However, for λ = 1 the game in Definition 4.5 may not be stopping, as shown below.

Example 6.2.
Consider the probabilistic automaton in Figure 2 and its associated probabilistic bisimilarity game. By choosing a strategy σ_max for the max player such that σ_max(R) = (t, u), the vertex (t, u) has probability zero of reaching a sink. This can be seen in Figure 2, since there are no paths using the edge (R, (t, u)) that lead to a sink.

In [TvB16], for the case of labelled Markov chains, the simple stochastic game was proven to be stopping by imposing that bisimilar state pairs be 0-sinks. This method does not generalize to probabilistic automata: indeed, Example 6.2 provides a counterexample even when bisimilar state pairs are 0-sinks.

In the remainder of the section, we provide a general algorithm to compute the bisimilarity distance for every λ ∈ (0, 1].
Condon’s algorithm iteratively updates thestrategies of the min and max players in turn, on the basis of the current over-approximationof the value of the game. Next we show how Condon’s policy updates can be performeddirectly on coupling structures.For the update of the coupling structure, we use a measure-coupling map k ( d )( µ, ν ) ∈ V (Ω( µ, ν )) and a set-coupling map h ( d )( s, t ) ∈ R ( δ ( s ) , δ ( t )) such that k ( d )( µ, ν ) ∈ argmin (cid:110) (cid:80) u,v ∈ S ω ( u, v ) · d ( u, v ) | ω ∈ V (Ω( µ, ν )) (cid:111) , and (6.1) h ( d )( s, t ) ∈ argmin (cid:110) max ( µ,ν ) ∈ R K ( d )( µ, ν ) | R ∈ R ( δ ( s ) , δ ( t )) (cid:111) . (6.2)for d : S × S → [0 , µ, ν ∈ D ( S ), and s, t ∈ S . OMPUTING PROBABILISTIC BISIMILARITY DISTANCES FOR PROBABILISTIC AUTOMATA 15
The following lemma explains how the above ingredients can be used by the min player to improve its strategy.
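The set-coupling ingredient rests on Mémoli's dual characterisation: a set-coupling satisfying (6.2) can be assembled from pointwise minimizers, as made precise in (6.3) and Lemma 6.4 below. A small illustrative sketch, under the assumption that the values K(d)(µ, ν) have already been tabulated in a dictionary `K` indexed by pairs of distribution names:

```python
def set_coupling(K):
    """Build the set-coupling R of (6.3) from a table K[(mu, nu)] of
    Kantorovich values; by Lemma 6.4, the max of K over R equals the
    Hausdorff lifting H(K)(delta(s), delta(t))."""
    mus = sorted({m for m, _ in K})
    nus = sorted({n for _, n in K})
    phi = {m: min(nus, key=lambda n: K[(m, n)]) for m in mus}  # φ(µ)
    psi = {n: min(mus, key=lambda m: K[(m, n)]) for n in nus}  # ψ(ν)
    R = {(m, phi[m]) for m in mus} | {(psi[n], n) for n in nus}
    return R, max(K[p] for p in R)

def hausdorff(K):
    """Classical two-sided Hausdorff formula, for cross-checking Lemma 6.4."""
    mus = sorted({m for m, _ in K})
    nus = sorted({n for _, n in K})
    a = max(min(K[(m, n)] for n in nus) for m in mus)
    b = max(min(K[(m, n)] for m in mus) for n in nus)
    return max(a, b)

K = {("m1", "n1"): 0.2, ("m1", "n2"): 0.7,
     ("m2", "n1"): 0.9, ("m2", "n2"): 0.4}
R, val = set_coupling(K)
```

On this table the set-coupling is R = {(m1, n1), (m2, n2)}, and its maximal K-value 0.4 coincides with the Hausdorff value computed by the classical two-sided formula.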
Lemma 6.3.
Let C = (f, ρ). If there exist s, t ∈ S such that Δ_λ(γ^C_λ)(s, t) < γ^C_λ(s, t), then γ^D_λ ⊏ γ^C_λ for the coupling structure D = (k(γ^C_λ), ρ[(s, t)/R]), where R = h(γ^C_λ)(s, t).

Proof. Assume Δ_λ(γ^C_λ)(s, t) < γ^C_λ(s, t). Next we show Γ^D_λ(γ^C_λ) ⊏ γ^C_λ. In particular, we prove that Γ^D_λ(γ^C_λ)(s, t) < γ^C_λ(s, t) and, for all (u, v) ≠ (s, t), Γ^D_λ(γ^C_λ)(u, v) ≤ γ^C_λ(u, v).

By Δ_λ(γ^C_λ)(s, t) < γ^C_λ(s, t), we necessarily have ℓ(s) = ℓ(t). Thus
  Δ_λ(γ^C_λ)(s, t) = λ · H(K(γ^C_λ))(δ(s), δ(t))   (ℓ(s) = ℓ(t) and def. Δ_λ)
    = λ · min { max_{(µ,ν) ∈ R′} K(γ^C_λ)(µ, ν) | R′ ∈ R(δ(s), δ(t)) }   (Theorem 2.1)
    = λ · max_{(µ,ν) ∈ R} K(γ^C_λ)(µ, ν)   (R = h(γ^C_λ)(s, t) and (6.2))
    = λ · max_{(µ,ν) ∈ R} min_{ω ∈ Ω(µ,ν)} ∑_{u,v ∈ S} γ^C_λ(u, v) · ω(u, v)   (def. K(γ^C_λ))
    = λ · max_{(µ,ν) ∈ R} ∑_{u,v ∈ S} γ^C_λ(u, v) · k(γ^C_λ)(µ, ν)(u, v)   (by (6.1))
    = Γ^D_λ(γ^C_λ)(s, t).   (def. D and Γ^D_λ)

Therefore, Γ^D_λ(γ^C_λ)(s, t) = Δ_λ(γ^C_λ)(s, t) < γ^C_λ(s, t).

Let u, v ∈ S with (u, v) ≠ (s, t). We distinguish two cases.
• If ℓ(u) ≠ ℓ(v), then Γ^D_λ(γ^C_λ)(u, v) = 1 = Γ^C_λ(γ^C_λ)(u, v) = γ^C_λ(u, v).
• If ℓ(u) = ℓ(v), then
  Γ^D_λ(γ^C_λ)(u, v) = λ · max_{(µ,ν) ∈ ρ(u,v)} ∑_{x,y ∈ S} k(γ^C_λ)(µ, ν)(x, y) · γ^C_λ(x, y)   (def. Γ^D_λ and D)
    ≤ λ · max_{(µ,ν) ∈ ρ(u,v)} ∑_{x,y ∈ S} f(µ, ν)(x, y) · γ^C_λ(x, y)   ((6.1), f(µ, ν) ∈ Ω(µ, ν))
    = Γ^C_λ(γ^C_λ)(u, v) = γ^C_λ(u, v).   (def. Γ^C_λ and γ^C_λ)

Thus Γ^D_λ(γ^C_λ) ⊏ γ^C_λ.
By the Knaster-Tarski fixed point theorem, we conclude that γ^D_λ ⊏ γ^C_λ.

Lemma 6.3 suggests that C = (f, ρ) can be improved by replacing the measure-coupling map f with k(γ^C_λ) and updating the set-coupling map ρ at (s, t) with R = h(γ^C_λ)(s, t). Note that a measure-coupling k(d)(µ, ν) satisfying (6.1) can be computed by solving a linear program and ensuring that the optimal solution is a vertex of the polytope Ω(µ, ν) [Orl85, KS95]. A set-coupling h(d)(s, t) satisfying (6.2) is the following:

  R = { (µ, φ(µ)) | µ ∈ δ(s) } ∪ { (ψ(ν), ν) | ν ∈ δ(t) } ∈ R(δ(s), δ(t)),   (6.3)

where φ, ψ are such that φ(µ) ∈ argmin_{ν ∈ δ(t)} K(d)(µ, ν) and ψ(ν) ∈ argmin_{µ ∈ δ(s)} K(d)(µ, ν). The following lemma justifies our choice of h(d)(s, t).

Lemma 6.4.
Let R be as in (6.3). Then H(K(d))(δ(s), δ(t)) = max_{(µ,ν) ∈ R} K(d)(µ, ν).

Proof. By Theorem 2.1 and R ∈ R(δ(s), δ(t)), we have H(K(d))(δ(s), δ(t)) ≤ max_{(µ,ν) ∈ R} K(d)(µ, ν). Hence, it suffices to prove
(i) K(d)(µ, φ(µ)) ≤ H(K(d))(δ(s), δ(t)), for all µ ∈ δ(s), and
(ii) K(d)(ψ(ν), ν) ≤ H(K(d))(δ(s), δ(t)), for all ν ∈ δ(t).

Algorithm 1:
Simple policy iteration algorithm computing d_λ for λ ∈ (0, 1).
1   Initialise C = (f, ρ) as an arbitrary vertex coupling structure for A
2   while ∃(s, t). Δ_λ(γ^C_λ)(s, t) < γ^C_λ(s, t) do
3       R ← h(γ^C_λ)(s, t)
4       C ← (k(γ^C_λ), ρ[(s, t)/R])    /* update coupling structure */
5   end
6   return γ^C_λ    /* γ^C_λ = d_λ */

We prove (i). Let µ ∈ δ(s). Then
  H(K(d))(δ(s), δ(t)) ≥ max_{µ′ ∈ δ(s)} min_{ν ∈ δ(t)} K(d)(µ′, ν)   (def. H)
    ≥ min_{ν ∈ δ(t)} K(d)(µ, ν)   (µ ∈ δ(s))
    = K(d)(µ, φ(µ)).   (φ(µ) ∈ argmin_{ν ∈ δ(t)} K(d)(µ, ν))
The proof of (ii) follows similarly.

Remark 6.5.
The update procedure entailed by Lemma 6.3 can be performed in polynomial time in the size of the probabilistic automaton A. Indeed, k(d)(µ, ν) can be obtained by solving a transportation problem in polynomial time [Orl85, KS95]. As for h(d)(s, t), one can obtain φ(µ) (resp. ψ(ν)) by computing K(d)(µ, ν) in polynomial time and selecting the ν (resp. µ) ranging over δ(t) (resp. δ(s)) that achieves the minimum.

6.2. Discounted case.
The simple policy iteration algorithm for computing d_λ in the case λ < 1 starts from an arbitrary vertex coupling structure C_0 (line 1), constructed, e.g., by using the North-West corner method in polynomial time (see, e.g., [Str89, pg. 180]). Then it continues by iteratively generating a sequence C_0, C_1, ..., C_n of vertex coupling structures such that d_λ = γ^{C_n}_λ. At each iteration, the current coupling structure C_i is tested for optimality (line 2) by checking whether the corresponding λ-discrepancy γ^{C_i}_λ is a fixed point of Δ_λ. If there exists (s, t) ∈ S × S violating the equality γ^{C_i}_λ = Δ_λ(γ^{C_i}_λ), it constructs C_{i+1} by updating C_i at (s, t) as prescribed by Lemma 6.3 (line 4). This guarantees that γ^{C_i}_λ ⊐ γ^{C_{i+1}}_λ, i.e., a strict improvement of the λ-discrepancy towards the minimal one.

Termination follows from the fact that there are only finitely many vertex coupling structures for A. Furthermore, the correctness of the output of the algorithm is due to the fact that Δ_λ has a unique fixed point when 0 ≤ λ < 1.
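The North-West corner rule mentioned above admits a compact implementation; a sketch, assuming the two marginal distributions µ and ν are given as dictionaries in a fixed state order:

```python
def north_west_corner(mu, nu, eps=1e-12):
    """Compute a feasible point of the transportation polytope Omega(mu, nu)
    by the North-West corner rule: scan the (row, column) grid from the
    top-left, each time shipping as much mass as the remaining row supply
    and column demand allow."""
    rows, cols = list(mu.items()), list(nu.items())
    supply = [m for _, m in rows]
    demand = [m for _, m in cols]
    omega, i, j = {}, 0, 0
    while i < len(rows) and j < len(cols):
        m = min(supply[i], demand[j])
        if m > eps:
            omega[(rows[i][0], cols[j][0])] = m
        supply[i] -= m
        demand[j] -= m
        if supply[i] <= eps:   # row exhausted: move down
            i += 1
        else:                  # column exhausted: move right
            j += 1
    return omega

omega = north_west_corner({"u1": 0.5, "u2": 0.5}, {"v1": 0.3, "v2": 0.7})
```

Since the rule saturates one marginal constraint at each step, the result has at most |supp(µ)| + |supp(ν)| − 1 nonzero entries, i.e., it is a basic feasible solution of the transportation problem.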
Theorem 6.6. Let λ ∈ (0, 1). Algorithm 1 terminates and computes d_λ.

Proof. First we prove termination. Note that the set

  { γ^C_λ | C a vertex coupling structure for A }   (6.4)

is finite, because for all s, t ∈ S the set R(δ(s), δ(t)) is finite, and for all µ ∈ δ(s) and ν ∈ δ(t) the polytope Ω(µ, ν) has finitely many vertices, i.e., V(Ω(µ, ν)) is finite. Towards a contradiction, assume that Algorithm 1 does not terminate. Let C_0, C_1, C_2, ... be the infinite sequence of coupling structures generated during a non-terminating execution of Algorithm 1. Since the set in (6.4) is finite, there must be i < j such that γ^{C_i}_λ = γ^{C_j}_λ.
However, the updates of the coupling structures in Algorithm 1 ensure that γ^{C_n}_λ ⊐ γ^{C_{n+1}}_λ for all n ∈ ℕ: since we are considering a non-terminating execution, we have Δ_λ(γ^{C_n}_λ)(s, t) < γ^{C_n}_λ(s, t) for some s, t ∈ S, and C_{n+1} is obtained by the update performed in line 4, which is exactly the one prescribed by Lemma 6.3. This contradicts γ^{C_i}_λ = γ^{C_j}_λ. Hence, Algorithm 1 terminates.

When the execution of Algorithm 1 has reached the return statement, we have Δ_λ(γ^{C_n}_λ)(s, t) ≥ γ^{C_n}_λ(s, t) for all s, t ∈ S, i.e., γ^{C_n}_λ ⊑ Δ_λ(γ^{C_n}_λ). By Lemma 5.4, γ^{C_n}_λ ⊒ Δ_λ(γ^{C_n}_λ); therefore γ^{C_n}_λ = Δ_λ(γ^{C_n}_λ). The operator Δ_λ is λ-Lipschitz continuous [Tan18, Proposition 10.3.2(b)]; thus, by Banach's fixed point theorem, Δ_λ has a unique fixed point. Hence, γ^{C_n}_λ = d_λ.

6.3. Undiscounted case.
For λ = 1, the termination condition of the simple policy iteration algorithm of Section 6.1 is not sufficient to guarantee correctness, since Algorithm 1 may terminate prematurely by returning a fixed point of Δ_1 that is not the minimal one. Towards a stronger termination condition, we introduce the notion of self-closed relations w.r.t. a fixed point of Δ_1, originally due to [Fu12].

Definition 6.7.
A relation M ⊆ S × S is self-closed w.r.t. d = Δ_1(d) if, whenever s M t,
• ℓ(s) = ℓ(t) and d(s, t) > 0,
• if s → µ and d(s, t) = min_{ν′ ∈ δ(t)} K(d)(µ, ν′), then there exist t → ν and ω ∈ Ω(µ, ν) such that d(s, t) = ∑_{u,v ∈ S} d(u, v) · ω(u, v) and supp(ω) ⊆ M,
• if t → ν and d(s, t) = min_{µ′ ∈ δ(s)} K(d)(µ′, ν), then there exist s → µ and ω ∈ Ω(µ, ν) such that d(s, t) = ∑_{u,v ∈ S} d(u, v) · ω(u, v) and supp(ω) ⊆ M.
Two states are self-closed w.r.t. d, written s ≈_d t, if they are related by some self-closed relation w.r.t. d.

Let M be a self-closed relation w.r.t. d, where d = γ^C_1 for some coupling structure C. In the probabilistic automaton A_C, a state pair (s, t) ∈ M can reach some ω with supp(ω) ⊆ M. As we will see in the proof of Lemma 6.8, this allows us to reduce all values (d(s, t))_{(s,t) ∈ M} simultaneously by a small amount so that d still is a prefixed point of Δ_1.

It can be easily shown that ≈_d is the largest self-closed relation w.r.t. d. Note that the concept of self-closedness above is defined only for fixed points of Δ_1. As remarked in [Fu12], the largest self-closed relation ≈_d can be computed in polynomial time by using partition refinement techniques similar to those employed to compute the largest bisimilarity relation.

The next lemma states that if, for a fixed point d = Δ_1(d), the relation ≈_d is nonempty, then d is not the least fixed point of Δ_1.

Lemma 6.8.
Let d = Δ_1(d). If there exists a nonempty self-closed relation M w.r.t. d, then there exists d_M ⊏ d such that Δ_1(d_M) ⊑ d_M. Moreover, d_M can be computed in polynomial time in the size of the probabilistic automaton A.

Proof. Let M be a nonempty self-closed relation w.r.t. d. For arbitrary s, t ∈ S, µ ∈ δ(s), and ν ∈ δ(t), define

  θ_s(µ, t) = d(s, t) − min_{ν ∈ δ(t)} K(d)(µ, ν)   and   θ_t(s, ν) = d(s, t) − min_{µ ∈ δ(s)} K(d)(µ, ν).

Note that θ_s(µ, t) and θ_t(s, ν) are non-negative since d = Δ_1(d). Let θ = min{θ_1, θ_2, θ_3}, where
• θ_1 = min { θ_s(µ, t) | (s, t) ∈ M ∧ µ ∈ δ(s) ∧ θ_s(µ, t) > 0 };
• θ_2 = min { θ_t(s, ν) | (s, t) ∈ M ∧ ν ∈ δ(t) ∧ θ_t(s, ν) > 0 };
• θ_3 = min { d(s, t) | (s, t) ∈ M };
with the convention min ∅ = 1. Note that θ_3 > 0 because M is a nonempty self-closed relation w.r.t. d. Therefore θ > 0. We define the map d_M : S × S → [0, 1] as

  d_M(s, t) = d(s, t) − θ   if (s, t) ∈ M,
  d_M(s, t) = d(s, t)       if (s, t) ∉ M.

It is clear that d_M is well-defined. Moreover, d_M ⊏ d because M is nonempty and θ > 0. Next we prove that Δ_1(d_M) ⊑ d_M. Let s, t ∈ S. We consider two cases:
• Assume (s, t) ∉ M. Then
  Δ_1(d_M)(s, t) ≤ Δ_1(d)(s, t)   (by d_M ⊑ d and Δ_1 monotone)
    = d(s, t)   (d = Δ_1(d))
    = d_M(s, t).   ((s, t) ∉ M)
• Assume (s, t) ∈ M. Then ℓ(s) = ℓ(t). Let µ ∈ δ(s). We consider two subcases:
  (1) If θ_s(µ, t) > 0, then
    d_M(s, t) = d(s, t) − θ   (def. d_M)
      ≥ d(s, t) − θ_s(µ, t)   (0 < θ ≤ θ_s(µ, t))
      = min_{ν ∈ δ(t)} K(d)(µ, ν)   (def. θ_s(µ, t))
      ≥ min_{ν ∈ δ(t)} K(d_M)(µ, ν).   (d_M ⊑ d and K monotone)
  (2) If θ_s(µ, t) = 0, then d(s, t) = min_{ν ∈ δ(t)} K(d)(µ, ν). Since M is self-closed w.r.t. d, there exists ν′ ∈ δ(t) such that d(s, t) = ∑_{u,v ∈ S} d(u, v) · ω(u, v) for some ω ∈ Ω(µ, ν′) with supp(ω) ⊆ M. Thus
    min_{ν ∈ δ(t)} K(d_M)(µ, ν) ≤ K(d_M)(µ, ν′)   (ν′ ∈ δ(t))
      = min_{ω′ ∈ Ω(µ,ν′)} ∑_{u,v ∈ S} d_M(u, v) · ω′(u, v)   (def. K(d_M))
      ≤ ∑_{u,v ∈ S} d_M(u, v) · ω(u, v)   (ω ∈ Ω(µ, ν′))
      = ∑_{(u,v) ∈ M} d_M(u, v) · ω(u, v)   (supp(ω) ⊆ M)
      = ∑_{(u,v) ∈ M} (d(u, v) − θ) · ω(u, v)   (def. d_M)
      = ( ∑_{(u,v) ∈ M} d(u, v) · ω(u, v) ) − θ   (∑_{(u,v) ∈ M} ω(u, v) = 1)
      = d(s, t) − θ   (d(s, t) = ∑_{u,v ∈ S} d(u, v) · ω(u, v))
      = d_M(s, t).   (def. d_M)
  So, in both subcases (1) and (2), we have d_M(s, t) ≥ min_{ν ∈ δ(t)} K(d_M)(µ, ν). Since this inequality holds for all µ ∈ δ(s), we have d_M(s, t) ≥ max_{µ ∈ δ(s)} min_{ν ∈ δ(t)} K(d_M)(µ, ν).
Symmetrically, we can prove d_M(s, t) ≥ max_{ν ∈ δ(t)} min_{µ ∈ δ(s)} K(d_M)(µ, ν). Thus, by definition of the Hausdorff lifting, d_M(s, t) ≥ H(K(d_M))(δ(s), δ(t)). From this we conclude
  d_M(s, t) ≥ H(K(d_M))(δ(s), δ(t)) = Δ_1(d_M)(s, t).   (ℓ(s) = ℓ(t) and def. Δ_1)

Finally, we consider the complexity of computing d_M. For computing θ, we need to compute in turn θ_1, θ_2, and θ_3. Since M ⊆ S × S, computing θ_3 can be done in quadratic time in |S|. The computation of θ_1 requires at most |M| · ∑_{s ∈ S} |δ(s)| solutions of a transportation problem. This can be done in polynomial time in the size of A. Similarly for θ_2.

The proof of Lemma 6.8 is essentially that of [Fu12, Theorem 3]. Given a nonempty self-closed relation M w.r.t. d, the above result can be used to obtain a prefixed point of Δ_1, namely d_M, that improves d towards the least fixed point. The prefixed point d_M of Lemma 6.8 is obtained from d by subtracting a suitable value θ > 0 on M:

  d_M(s, t) = d(s, t) − θ   if (s, t) ∈ M,
  d_M(s, t) = d(s, t)       if (s, t) ∉ M.
The value of θ that gives us the smallest prefixed point of this form is the maximal value satisfying the following inequalities (where, as in the definitions of θ_1 and θ_2, the first two families range over the strictly positive right-hand sides):
  θ ≤ d(s, t) − min_{ν′ ∈ δ(t)} K(d)(µ, ν′)   for all (s, t) ∈ M and µ ∈ δ(s),
  θ ≤ d(s, t) − min_{µ′ ∈ δ(s)} K(d)(µ′, ν)   for all (s, t) ∈ M and ν ∈ δ(t),
  θ ≤ d(s, t)                                  for all (s, t) ∈ M.
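Given these quantities, computing θ and d_M is mechanical; a sketch, under the simplifying assumption that all slack values θ_s(µ, t) and θ_t(s, ν) for pairs in M (each in general requiring a transportation problem) have already been collected in a single list:

```python
def reduce_on_self_closed(d, M, slacks, eps=1e-12):
    """Compute theta of Lemma 6.8 and the reduced prefixed point d_M.

    d      : dict mapping state pairs to current distance values
    M      : set of state pairs forming a self-closed relation w.r.t. d
    slacks : all values theta_s(mu, t) and theta_t(s, nu) for pairs in M,
             assumed precomputed (theta_1 and theta_2 merged into one list)
    """
    positive = [x for x in slacks if x > eps]
    theta12 = min(positive) if positive else 1.0   # convention: min ∅ = 1
    theta3 = min(d[p] for p in M)                  # positive on a self-closed M
    theta = min(theta12, theta3)
    # subtract theta uniformly on M, leave everything else untouched
    return {p: (v - theta if p in M else v) for p, v in d.items()}

d = {("s", "t"): 0.5, ("u", "v"): 0.2, ("x", "y"): 1.0}
M = {("s", "t"), ("u", "v")}
dM = reduce_on_self_closed(d, M, slacks=[0.0, 0.3, 0.0])
```

Here the positive slacks give θ_1 = θ_2 = 0.3, while θ_3 = 0.2, so θ = 0.2 and both pairs in M are lowered by 0.2.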
The fact that d_M is a prefixed point follows from the fact that M is a self-closed relation.

The following lemma provides us with a termination condition for the simple policy iteration algorithm to compute d_1. Indeed, according to it, if d is a fixed point of Δ_1, we can assert that d equals the bisimilarity distance d_1 simply by checking that the maximal self-closed relation w.r.t. d is empty.

Lemma 6.9.
Let d = Δ_1(d). If ≈_d = ∅, then d = d_1.

Proof. We proceed by contraposition. Assume that d ≠ d_1. We define a nonempty self-closed relation M w.r.t. d as follows:

  m = max_{s,t ∈ S} (d(s, t) − d_1(s, t)),   M = { (s, t) ∈ S × S | d(s, t) − d_1(s, t) = m }.

Clearly, m > 0 and M ≠ ∅, because d_1 ⊑ d and d ≠ d_1.

Let (s, t) ∈ M. We prove that the three conditions of Definition 6.7 hold.
(1) d(s, t) > 0 since 0 < m = d(s, t) − d_1(s, t) ≤ d(s, t). Now we prove that ℓ(s) = ℓ(t). Towards a contradiction, assume ℓ(s) ≠ ℓ(t). Then
  0 < m = d(s, t) − d_1(s, t) = Δ_1(d)(s, t) − Δ_1(d_1)(s, t) = 1 − 1 = 0,
leading to the contradiction 0 < 0.
(2) Let µ ∈ δ(s) be such that d(s, t) = min_{ν ∈ δ(t)} K(d)(µ, ν). Then we have
  d_1(s, t) = Δ_1(d_1)(s, t)   (by def. d_1)
    = H(K(d_1))(δ(s), δ(t))   (by ℓ(s) = ℓ(t))
    ≥ min_{ν ∈ δ(t)} K(d_1)(µ, ν).   (by def. H)
Let ν* ∈ δ(t) and ω ∈ Ω(µ, ν*) be such that
  min_{ν ∈ δ(t)} K(d_1)(µ, ν) = K(d_1)(µ, ν*) = ∑_{u,v ∈ S} d_1(u, v) · ω(u, v).   (6.5)
Then the following inequalities hold:
  K(d_1)(µ, ν*) = ∑_{u,v ∈ S} d_1(u, v) · ω(u, v)   (by (6.5))
    = ∑_{u,v ∈ S} ( d(u, v) − (d(u, v) − d_1(u, v)) ) · ω(u, v)
    ≥ ∑_{u,v ∈ S} ( d(u, v) − m ) · ω(u, v)   (def. m)
    = ( ∑_{u,v ∈ S} d(u, v) · ω(u, v) ) − m   (ω ∈ D(S × S))
    ≥ K(d)(µ, ν*) − m.   (def. K and ω ∈ Ω(µ, ν*))
Thus, we have
  d(s, t) ≤ K(d)(µ, ν*)   (d(s, t) = min_{ν ∈ δ(t)} K(d)(µ, ν))
    ≤ K(d_1)(µ, ν*) + m   (K(d_1)(µ, ν*) ≥ K(d)(µ, ν*) − m)
    ≤ d_1(s, t) + m   (d_1(s, t) ≥ min_{ν ∈ δ(t)} K(d_1)(µ, ν) and def. ν*)
    = d_1(s, t) + (d(s, t) − d_1(s, t))   (def. m and (s, t) ∈ M)
    = d(s, t).
Therefore, all the above inequalities are, in fact, equalities. Hence,
  d(s, t) = K(d)(µ, ν*)   and   d_1(s, t) = K(d_1)(µ, ν*).   (6.6)
We conclude by proving that ω satisfies
  d(s, t) = ∑_{u,v ∈ S} d(u, v) · ω(u, v)   and   supp(ω) ⊆ M.
This can be observed as follows:
  ∑_{u,v ∈ S} d(u, v) · ω(u, v) ≥ d(s, t)   (ω ∈ Ω(µ, ν*) and (6.6))
    = d_1(s, t) + m   ((s, t) ∈ M and def. m)
    = ( ∑_{u,v ∈ S} d_1(u, v) · ω(u, v) ) + m   (by (6.6) and (6.5))
    = ∑_{u,v ∈ S} ( d_1(u, v) + m ) · ω(u, v)   (ω ∈ D(S × S))
    ≥ ∑_{u,v ∈ S} ( d_1(u, v) + (d(u, v) − d_1(u, v)) ) · ω(u, v)   (def. m)
    = ∑_{u,v ∈ S} d(u, v) · ω(u, v).
Hence, the above are in fact equalities and, in particular, d(s, t) = ∑_{u,v ∈ S} d(u, v) · ω(u, v). Consider now the following equalities:
  m = d(s, t) − d_1(s, t)   ((s, t) ∈ M)
    = d(s, t) − ∑_{u,v ∈ S} d_1(u, v) · ω(u, v)   (by (6.6) and (6.5))
    = ∑_{u,v ∈ S} ( d(u, v) − d_1(u, v) ) · ω(u, v).   (d(s, t) = ∑_{u,v ∈ S} d(u, v) · ω(u, v))
Since d(u, v) − d_1(u, v) ≤ m for all u, v ∈ S, the above equalities imply that whenever ω(u, v) > 0, d(u, v) − d_1(u, v) = m. Therefore supp(ω) ⊆ M.
(3) Can be argued symmetrically to the previous case.
Therefore, M is a nonempty self-closed relation with respect to d.

Algorithm 2: Simple policy iteration algorithm computing d_1.
1   Initialise C = (f, ρ) as an arbitrary vertex coupling structure for A
2   isMin ← false
3   while ¬isMin do
4       while ∃(s, t). Δ_1(γ^C_1)(s, t) < γ^C_1(s, t) do
5           R ← h(γ^C_1)(s, t)
6           C ← (k(γ^C_1), ρ[(s, t)/R])    /* update coupling structure */
7       end
8       M ← ≈_{γ^C_1}    /* note that γ^C_1 = Δ_1(γ^C_1) */
9       if M = ∅ then
10          isMin ← true    /* γ^C_1 = d_1 */
11      else
12          Compute d = (γ^C_1)_M as in Lemma 6.8
13          Re-initialise C as a vertex coupling structure s.t. Γ^C_1(d) = Δ_1(d)
14      end
15  end
16  return γ^C_1

Algorithm 2 extends the procedure described in Section 6.1 by encapsulating the policy iteration update (lines 4–7) into an outer loop (lines 3–15) that is responsible for checking whether the fixed point γ^{C_i}_1 returned is the minimal one. According to Lemma 5.4, Δ_1(γ^{C_i}_1) ⊑ γ^{C_i}_1. Hence, when we reach line 8, we have that Δ_1(γ^{C_i}_1) = γ^{C_i}_1. Therefore, by Lemmas 6.8 and 6.9, γ^{C_i}_1 = d_1 if and only if M = ≈_{γ^{C_i}_1} is empty. If M is empty, we set the variable isMin to true (line 10), causing the outer loop to terminate. Otherwise, we construct d = (γ^{C_i}_1)_M as in Lemma 6.8 (line 12) and re-start the inner loop from a vertex coupling structure C_{i+1} such that Γ^{C_{i+1}}_1(d) = Δ_1(d) (line 13) (e.g., by using C_{i+1} = (k(d), ρ) where ρ(s, t) = h(d)(s, t) for all s, t ∈ S). As proven in Theorem 6.10, γ^{C_i}_1 ⊐ γ^{C_{i+1}}_1. This guarantees a strict improvement of the discrepancy towards the minimal one. Termination of Algorithm 2 is justified by arguments similar to those for the discounted case.

Theorem 6.10.
Algorithm 2 terminates and computes d_1.

Proof. First we prove termination. Recall that { γ^C_1 | C a vertex coupling structure for A } is finite. Towards a contradiction, assume that Algorithm 2 does not terminate. Let C_0, C_1, C_2, ... be the infinite sequence of coupling structure updates generated during a non-terminating execution of Algorithm 2. Since the set above is finite, there must be i < j such that γ^{C_i}_1 = γ^{C_j}_1. However, the updates of the coupling structures in Algorithm 2 ensure that γ^{C_n}_1 ⊐ γ^{C_{n+1}}_1 for all n ∈ ℕ, which contradicts this. Indeed, let n ∈ ℕ; we consider two cases:
• Assume Δ_1(γ^{C_n}_1)(s, t) < γ^{C_n}_1(s, t) for some s, t ∈ S. In this case C_{n+1} is obtained by the update performed in line 6, which is exactly the one prescribed by Lemma 6.3. Therefore γ^{C_n}_1 ⊐ γ^{C_{n+1}}_1.
• Assume Δ_1(γ^{C_n}_1)(s, t) ≥ γ^{C_n}_1(s, t) for all s, t ∈ S, i.e., γ^{C_n}_1 ⊑ Δ_1(γ^{C_n}_1). By Lemma 5.4, γ^{C_n}_1 ⊒ Δ_1(γ^{C_n}_1); therefore γ^{C_n}_1 = Δ_1(γ^{C_n}_1). In this case C_{n+1} is constructed as a vertex coupling structure such that Γ^{C_{n+1}}_1(d) = Δ_1(d), where M = ≈_{γ^{C_n}_1} ≠ ∅ and d = (γ^{C_n}_1)_M (see line 13). Then the following hold:
  γ^{C_n}_1 ⊐ d ⊒ Δ_1(d)   (Lemma 6.8)
    = Γ^{C_{n+1}}_1(d)   (by construction)
    ⊒ γ^{C_{n+1}}_1.   (by Γ^{C_{n+1}}_1(d) ⊑ d and the Knaster-Tarski fixed point theorem)
This concludes the proof that γ^{C_n}_1 ⊐ γ^{C_{n+1}}_1 for all n ∈ ℕ.

When the execution of Algorithm 2 has reached line 10, we have γ^C_1 = Δ_1(γ^C_1) and, moreover, ≈_{γ^C_1} = ∅. Therefore, by Lemma 6.9, γ^C_1 = d_1. From here isMin is set to true, which prevents further executions of the body of the outer loop. Therefore Algorithm 2 reaches the return statement with γ^C_1 = d_1.

6.4. Experimental Results.
In this section, we evaluate the performance of the simple policy iteration algorithms on a collection of randomly generated probabilistic automata. All the algorithms have been implemented in Java, and the source code is publicly available.

The performance of Algorithm 1 has been compared with an implementation of the value iteration algorithm proposed by Fu [Fu12, Section 4]. This algorithm works as follows. Starting from the bottom element 0, it iteratively applies Δ_λ to the current distance function, generating the increasing chain 0 ⊑ Δ_λ(0) ⊑ Δ_λ²(0) ⊑ ··· ⊑ Δ_λ^{k−1}(0) ⊑ Δ_λ^k(0).

For each input instance, the comparison involves the following steps:
(1) We run Algorithm 1, storing the execution time, the number of solved transportation problems, and the number of coupling structures generated during the execution (i.e., the number of times a λ-discrepancy has been computed);
(2) Then, on the same instance, we execute the value iteration algorithm until its running time exceeds that of step 1. We report the execution time, the number of solved transportation problems, and the number of iterations;
(3) Finally, we report the error max_{s,t ∈ S} |d_λ(s, t) − d(s, t)| between the distance d_λ computed in step 1 and the approximate result d obtained in step 2.

This has been done for a collection of automata varying from 10 to 50 states. For each n = 10, ...,
50, we considered 100 randomly generated probabilistic automata, varying the probabilistic out-degree and the nondeterministic out-degree. Table 1 reports the average results of the comparison. Our algorithm is able to compute the exact solution before value iteration can under-approximate it with an error ranging from 0.004 to 0.06, which is a non-negligible error considering that we fixed λ = 0.8.

https://bitbucket.org/discoveri/probabilistic-bisimilarity-distances-probabilistic-automata

[Table 1: Average performance (execution time in seconds, number of solved transportation problems, and number of iterations) of Simple Policy Iteration versus Value Iteration, with the resulting error, on 100 randomly generated automata for each number of states n = 10..50, nondeterministic out-degree k = 1..3, and probabilistic out-degree p = 2..3. Discount factor λ = 0.8.]

[Figure 3: Average running time (seconds) of the Simple Policy Iteration Algorithm on 100 randomly generated automata, for each number of states n = 10..50, nondeterministic out-degree k = 1..3, and probabilistic out-degree p = 2..3 (curves p=2,k=1 through p=3,k=3). Discount factor λ = 0.8.]

Each iteration requires computing the λ-discrepancy (cf. Remark 5.3) as the solution of a linear program which has O(nk) variables and O(nk) constraints, where n and k are the number of states and the nondeterministic out-degree of the automaton, respectively.

Algorithm 2 extends the simple policy iteration algorithms proposed in [BBLM13, TvB16] for labelled Markov chains. As pointed out in [Tan18], implementations based on the decision procedure for the existential fragment of the first-order theory of the reals struggle to handle labelled Markov chains with fifty states. For probabilistic automata, the algorithms in [CdAMR08, CdAMR10] suffer from the same problem. The performance of Algorithm 2 is comparable to that of Algorithm 1 (cf. Table 2), despite the fact that the simple policy iteration algorithm alone is not guaranteed to be sound when the discount factor equals one.
50, nondeterministic out-degree k = 1 .. p = 2 ..
3. Discount λ = 1; accuracy 0 . Relation with Probabilistic Model Checking
In this section we show how the probabilistic bisimilarity distance of Deng et al. relates to the problem of model checking $\omega$-regular specifications against probabilistic automata, where the nondeterministic choices are resolved by randomized schedulers.

Probabilistic automata are used for the verification of concurrent probabilistic systems, where the choice of how to interleave the executions of the parallel components is modelled by means of nondeterminism in the choice of the next transition to be taken. Technically, an execution of a probabilistic automaton $\mathcal{A} = (S, L, \rightarrow, \ell)$ is an infinite sequence $s_0 s_1 \dots \in S^\omega$ of labelled states obtained by taking a succession of probabilistic transitions $s_i \rightarrow \mu_i$ such that $\mu_i(s_{i+1}) > 0$, for each $i \in \mathbb{N}$. The choice of the transition to be taken at each state is resolved by means of a scheduler. In this paper we consider randomized schedulers, i.e., functions $\pi \colon S^+ \to \mathcal{D}(S)$ mapping a nonempty finite sequence of states $s_0 \dots s_n \in S^+$ (the execution history) to a convex combination of distributions of the form $\sum_{s_n \rightarrow \mu} \alpha_\mu \cdot \mu \in \mathcal{D}(S)$, for some $\alpha_\mu \in [0,1]$ such that $\sum_{s_n \rightarrow \mu} \alpha_\mu = 1$. Roughly, a randomized scheduler decides the probability with which the next transition is chosen given the history of visited states.

The combination of a probabilistic automaton $\mathcal{A}$ with a scheduler $\pi$ induces a Markov chain on a random variable $X = (X_0, X_1, \dots) \in S^\omega$ on the measurable space of infinite sequences over $S$, with distribution
$$\Pr^{\pi}_s(X_0 = s_0, \dots, X_n = s_n) = \delta_s(s_0) \cdot \prod_{i=0}^{n-1} \pi(s_0 \dots s_i)(s_{i+1}),$$
where $\delta_x$ denotes the Dirac distribution concentrated at $x$. The above describes the probability of executing the sequence of steps $s_0 \dots s_n$ starting from the state $s$ under the randomized scheduler $\pi$.
The measurable sets of $S^\omega$ are the elements of the infinite product $\sigma$-algebra $(2^S)^\omega$, i.e., the smallest $\sigma$-algebra containing the subsets of the form $s_0 \dots s_n S^\omega = \{ s_0 \dots s_n w \mid w \in S^\omega \}$ (a.k.a. discrete cylinders), for arbitrary $n \in \mathbb{N}$ and $s_i \in S$ with $0 \le i \le n$. Measurable sets are the subsets of sequences on which the probability measure $\Pr^{\pi}_s$ is well-defined. For a measurable set $H \subseteq S^\omega$, we denote by $\Pr^{\pi}_s(X \in H)$ the probability that an execution starting from $s$ under the scheduler $\pi$ belongs to $H$.

Rather than measuring the probability of concrete executions over $S^\omega$, one is often more interested in the probability that certain execution traces satisfy abstract properties over the measurable space $L^\omega$ of infinite sequences of labels, representing the sequences of atomic properties satisfied by concrete executions of the automaton. Formally, for a measurable set $E \subseteq L^\omega$, we denote by $\Pr^{\pi}_s(\ell(X) \in E)$ the probability that an execution generates a sequence of labels in $E$, where $\ell(X) = (\ell(X_0), \ell(X_1), \dots) \in L^\omega$ is the random variable induced from $X$ by the labelling function $\ell$. The $\sigma$-algebra of $L^\omega$ contains all the $\omega$-regular languages over the alphabet $L$ [BK08, Chapter 10]. This means that the probability that the runs of $\mathcal{A}$ satisfy $\omega$-regular properties, possibly expressed in the form of LTL formulas, can be formally measured by $\Pr^{\pi}_s$, hence allowing the quantitative analysis of probabilistic automata.

The quantitative analysis of a probabilistic automaton $\mathcal{A}$ against $\omega$-regular specifications, more commonly known as probabilistic model checking, amounts to establishing the maximal and minimal probability of satisfying $\omega$-regular properties $E \subseteq L^\omega$ over infinite sequences of labels from a starting state $s$.
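For the simplest such properties, e.g., reaching a given label within $k$ steps, the extremal probabilities over all schedulers can be computed by dynamic programming, optimizing over the available transition distributions at each step (for step-bounded reachability the optimum is attained by a deterministic scheduler). The automaton below is a hypothetical illustration, not one of the paper's examples.

```python
# Maximal/minimal probability, over all schedulers, of reaching a state
# labelled 'bad' within k steps, by backward dynamic programming.

DELTA = {                                   # state -> available successor distributions
    's': [{'s': 1.0}, {'t': 0.5, 's': 0.5}],
    't': [{'t': 1.0}],
}
LABEL = {'s': 'ok', 't': 'bad'}

def opt_reach(state, k, pick):              # pick = max (best case) or min (worst case)
    v = {q: 1.0 if LABEL[q] == 'bad' else 0.0 for q in DELTA}
    for _ in range(k):
        v = {q: 1.0 if LABEL[q] == 'bad'
                else pick(sum(mu[r] * v[r] for r in mu) for mu in DELTA[q])
             for q in DELTA}
    return v[state]

print(opt_reach('s', 2, max))   # keep choosing the mixing transition: 0.75
print(opt_reach('s', 2, min))   # keep self-looping in 's': 0.0
```

The two calls bound, from above and below, what any randomized scheduler can achieve for this property.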
Formally, this corresponds to computing
$$\mathrm{Max}_s(E) = \sup_{\pi \in \Pi} \Pr^{\pi}_s(\ell(X) \in E) \qquad \text{and} \qquad \mathrm{Min}_s(E) = \inf_{\pi \in \Pi} \Pr^{\pi}_s(\ell(X) \in E),$$
where the infimum and supremum are taken over the set $\Pi$ of all randomized schedulers. Note that considering minimal or maximal probabilities corresponds to a worst-case or best-case analysis, respectively (see [BK08, Chapter 10] for more details).

The following is the main result of the section. It states that the probabilistic bisimilarity distance bounds the difference between the maximal (respectively, minimal) probabilities of satisfying any measurable linear-time property (e.g., $\omega$-regular specifications) from two given initial states.

Theorem 7.1.
For all measurable subsets $E \subseteq L^\omega$,
$$|\mathrm{Max}_s(E) - \mathrm{Max}_t(E)| \le d(s,t) \qquad \text{and} \qquad |\mathrm{Min}_s(E) - \mathrm{Min}_t(E)| \le d(s,t).$$

The above can be seen as a quantitative generalization of the folklore result that bisimilar states satisfy the same linear-time properties with the same probability.
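The proof of Theorem 7.1, given later in this section, rests on Mémoli's coupling characterization of the Hausdorff distance (Theorem 2.1): $\mathcal{H}(d)(K,H)$ is the least, over set-couplings $R \in \mathcal{R}(K,H)$, of the worst pairwise distance occurring in $R$. For small finite sets of reals this characterization can be checked by brute force; the sets below are arbitrary illustrations.

```python
from itertools import chain, combinations

K = [0.1, 0.5, 0.9]          # illustrative finite subsets of the real line
H = [0.2, 0.6]

def hausdorff(K, H):
    """Classical definition: max of the two directed Hausdorff distances."""
    directed = lambda A, B: max(min(abs(a - b) for b in B) for a in A)
    return max(directed(K, H), directed(H, K))

def coupling_hausdorff(K, H):
    """Dual form (cf. Theorem 2.1): inf over set-couplings R of sup |x - y|."""
    pairs = [(x, y) for x in K for y in H]
    best = float('inf')
    for R in chain.from_iterable(combinations(pairs, r)
                                 for r in range(1, len(pairs) + 1)):
        # R is a set-coupling when both of its projections are onto
        if {x for x, _ in R} == set(K) and {y for _, y in R} == set(H):
            best = min(best, max(abs(x - y) for x, y in R))
    return best

assert abs(hausdorff(K, H) - coupling_hausdorff(K, H)) < 1e-12
# [Mem11, Lemma 3.2], used in the proof: the Hausdorff distance dominates the
# difference of the suprema and the difference of the infima.
assert hausdorff(K, H) >= max(abs(max(K) - max(H)), abs(min(K) - min(H)))
```

The brute force enumerates all $2^{|K||H|}$ candidate relations, so it is only a sanity check for tiny sets.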
Remark 7.2.
The relevance of Theorem 7.1 is not just theoretical, but could possibly lead to significant practical applications. Imagine that the distance $d(s,t)$ between some given states $s$ and $t$ is small (and known); then computing $\mathrm{Max}_s(E)$ (resp. $\mathrm{Min}_s(E)$) at the state $s$ may be enough to obtain a good approximation of the actual value of $\mathrm{Max}_t(E)$ (resp. $\mathrm{Min}_t(E)$) without the need of computing it at the state $t$. This approach may lead to savings in the overall model-checking time for $t$, especially if the executions starting from $s$ have a significantly reduced degree of nondeterminism compared with those starting from $t$.

The proof of Theorem 7.1 is based on the coupling characterisation of the bisimilarity distance presented in Theorem 5.7 and the following technical lemma (Lemma 7.3), which establishes under which conditions the discrepancy associated with a coupling structure can be used to bound the variational distance between the probabilities induced by a probabilistic automaton $\mathcal{A}$ under two different schedulers. Specifically, we establish how, from a coupling structure $\mathcal{C}$, one can retrieve a set-coupling $R_\mathcal{C} \in \mathcal{R}(\Pi, \Pi)$ of schedulers for $\mathcal{A}$ such that, for each pair of schedulers $(\pi, \pi')$ for $\mathcal{A}$ in $R_\mathcal{C}$, the difference $|\Pr^{\pi}_s(\ell(X) \in E) - \Pr^{\pi'}_t(\ell(X) \in E)|$, for any measurable $E \subseteq L^\omega$, is bounded by the discrepancy $\gamma_\mathcal{C}(s,t)$.

The definition of $R_\mathcal{C}$ can be intuitively understood by recalling that $\gamma_\mathcal{C}(s,t)$ corresponds to the maximal probability of reaching a pair of states with different labels from the state pair $(s,t)$ in the automaton $\mathcal{A}_\mathcal{C}$ induced by $\mathcal{C}$. Roughly, $R_\mathcal{C}$ collects all the pairs of schedulers for $\mathcal{A}$ obtained as the left and right projections of a scheduler for $\mathcal{A}_\mathcal{C}$. How the projections are defined is technical, and the interested reader can find the formal definition in the proof.

Lemma 7.3.
For any coupling structure $\mathcal{C}$ for $\mathcal{A}$ and $s, t \in S$, there exists $R_\mathcal{C} \in \mathcal{R}(\Pi, \Pi)$ such that, for all measurable $E \subseteq L^\omega$ and $(\pi, \pi') \in R_\mathcal{C}$,
$$|\Pr^{\pi}_s(\ell(X) \in E) - \Pr^{\pi'}_t(\ell(X) \in E)| \le \gamma_\mathcal{C}(s,t).$$

Proof. Fix $s, t \in S$ and a coupling structure $\mathcal{C} = (f, \rho)$ for $\mathcal{A}$. Let $\mathcal{A}_\mathcal{C}$ be the automaton associated with the coupling structure $\mathcal{C}$. We split the proof in two parts: (Part 1) deals with the definition of the set-coupling $R_\mathcal{C} \in \mathcal{R}(\Pi, \Pi)$; (Part 2) with proving that $|\Pr^{\pi}_s(\ell(X) \in E) - \Pr^{\pi'}_t(\ell(X) \in E)| \le \gamma_\mathcal{C}(s,t)$ for all pairs of randomized schedulers $(\pi, \pi') \in R_\mathcal{C}$.

Hereafter, for a nonempty finite sequence $\sigma \in S^+$ and a random variable $X$ on $S^\omega$, we use $X \prec \sigma$ to denote $X \in \sigma S^\omega$.

Part 1:
Let $(X, Y) \in S^\omega \times S^\omega$ be the random variable describing the infinite sequence of state pairs along a run of $\mathcal{A}_\mathcal{C}$. Then, for any two nonempty finite sequences of the same length $\sigma = \sigma_0 \dots \sigma_n$ and $\tau = \tau_0 \dots \tau_n$ over $S$,
$$\Pr^{\pi}_{(s,t)}((X,Y) \prec \langle\sigma,\tau\rangle) = \Pr^{\pi}_{(s,t)}(X \prec \sigma,\, Y \prec \tau)$$
is the probability that, starting from $(s,t)$, a run of $\mathcal{A}_\mathcal{C}$ under the scheduler $\pi$ has prefix $\langle\sigma,\tau\rangle = (\sigma_0,\tau_0)\dots(\sigma_n,\tau_n)$. The above can be alternatively formulated in terms of conditional probabilities in the following two ways:
$$\Pr^{\pi}_{(s,t)}(X \prec \sigma,\, Y \prec \tau) = \Pr^{\pi}_{(s,t)}(X \prec \sigma) \cdot \Pr^{\pi}_{(s,t)}(Y \prec \tau \mid X \prec \sigma), \eqno(7.1)$$
$$\Pr^{\pi}_{(s,t)}(X \prec \sigma,\, Y \prec \tau) = \Pr^{\pi}_{(s,t)}(Y \prec \tau) \cdot \Pr^{\pi}_{(s,t)}(X \prec \sigma \mid Y \prec \tau). \eqno(7.2)$$

Given a scheduler $\pi$ for $\mathcal{A}_\mathcal{C}$, we define the maps $\pi_S, \pi_T \colon S^+ \to \mathcal{D}(S)$ as follows (adopting the convention that $\Pr^{\pi}_{(s,t)}(E \mid F) = 0$ whenever $\Pr^{\pi}_{(s,t)}(F) = 0$, for any two measurable events $E, F$), for arbitrary nonempty sequences $\sigma, \tau$ over $S$:
$$\pi_S(\sigma)(u) = \sum_{\tau \in S^{|\sigma|}} \Big( \Pr^{\pi}_{(s,t)}(Y \prec \tau \mid X \prec \sigma) \sum_{v \in S} \pi(\langle\sigma,\tau\rangle)(u,v) \Big), \eqno(7.3)$$
$$\pi_T(\tau)(u) = \sum_{\sigma \in S^{|\tau|}} \Big( \Pr^{\pi}_{(s,t)}(X \prec \sigma \mid Y \prec \tau) \sum_{v \in S} \pi(\langle\sigma,\tau\rangle)(v,u) \Big). \eqno(7.4)$$
We call $\pi_S$ and $\pi_T$ the left and right projections of $\pi$. Intuitively, $\pi_S(\sigma)(u)$ describes the probability that, under the scheduler $\pi$, a run of $\mathcal{A}_\mathcal{C}$ with initial state $(s,t)$ has a prefix of the form $\langle\sigma u, \tau'\rangle$, for some $\tau' \in S^{|\sigma|+1}$; symmetrically, $\pi_T(\tau)(u)$ is the probability that the prefix is of the form $\langle\sigma', \tau u\rangle$, for some $\sigma' \in S^{|\tau|+1}$.

Next, we prove that $\pi_S$ and $\pi_T$ are well-defined schedulers for $\mathcal{A}$. We provide the proof only for $\pi_S$, as the one for $\pi_T$ is similar. We need to show that $\pi_S(\sigma)$ is a convex combination of the form $\sum_{\mu \in \delta(\sigma_n)} \alpha_\mu \cdot \mu$, for some $\alpha_\mu \in [0,1]$ such that $\sum_{\mu \in \delta(\sigma_n)} \alpha_\mu = 1$. By the hypothesis that $\pi$ is a scheduler for $\mathcal{A}_\mathcal{C}$, we have that $\pi(\langle\sigma,\tau\rangle) = \sum_{(\mu,\nu) \in \rho(\sigma_n,\tau_n)} \xi_{\mu,\nu} \cdot f(\mu,\nu)$, for some $\xi_{\mu,\nu} \in [0,1]$ such that $\sum_{(\mu,\nu) \in \rho(\sigma_n,\tau_n)} \xi_{\mu,\nu} = 1$. Without loss of generality, we can assume that $\xi_{\mu,\nu} = 0$ whenever $(\mu,\nu) \notin \rho(\sigma_n,\tau_n)$. Let $\kappa_{\sigma,\tau} = \Pr^{\pi}_{(s,t)}(Y \prec \tau \mid X \prec \sigma)$; then
$$\begin{aligned}
\pi_S(\sigma)(u) &= \sum_{\tau \in S^{|\sigma|}} \Big( \kappa_{\sigma,\tau} \sum_{v \in S} \pi(\langle\sigma,\tau\rangle)(u,v) \Big) && \text{(eq. (7.3))}\\
&= \sum_{\tau \in S^{|\sigma|}} \Big( \kappa_{\sigma,\tau} \sum_{v \in S} \sum_{(\mu,\nu) \in \rho(\sigma_n,\tau_n)} \xi_{\mu,\nu} \cdot f(\mu,\nu)(u,v) \Big) && \text{($\pi$ scheduler)}\\
&= \sum_{\tau \in S^{|\sigma|}} \Big( \kappa_{\sigma,\tau} \sum_{(\mu,\nu) \in \rho(\sigma_n,\tau_n)} \xi_{\mu,\nu} \sum_{v \in S} f(\mu,\nu)(u,v) \Big)\\
&= \sum_{\tau \in S^{|\sigma|}} \Big( \kappa_{\sigma,\tau} \sum_{(\mu,\nu) \in \rho(\sigma_n,\tau_n)} \xi_{\mu,\nu} \cdot \mu(u) \Big) && \text{($f$ measure-coupling map)}\\
&= \sum_{\tau \in S^{|\sigma|}} \Big( \kappa_{\sigma,\tau} \sum_{\mu \in \delta(\sigma_n)} \sum_{\nu \in \delta(\tau_n)} \xi_{\mu,\nu} \cdot \mu(u) \Big) && \text{($\rho$ set-coupling map)}\\
&= \sum_{\mu \in \delta(\sigma_n)} \Big( \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} \sum_{\nu \in \delta(\tau_n)} \xi_{\mu,\nu} \Big) \cdot \mu(u).
\end{aligned}$$
By letting $\alpha_\mu = \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} \sum_{\nu \in \delta(\tau_n)} \xi_{\mu,\nu}$, we get $\pi_S(\sigma) = \sum_{\mu \in \delta(\sigma_n)} \alpha_\mu \cdot \mu$ in the desired form. Next we show that this is a convex combination, i.e., $\sum_{\mu \in \delta(\sigma_n)} \alpha_\mu = 1$:
$$\begin{aligned}
\sum_{\mu \in \delta(\sigma_n)} \alpha_\mu &= \sum_{\mu \in \delta(\sigma_n)} \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} \sum_{\nu \in \delta(\tau_n)} \xi_{\mu,\nu} && \text{(def. $\alpha_\mu$)}\\
&= \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} \sum_{\mu \in \delta(\sigma_n)} \sum_{\nu \in \delta(\tau_n)} \xi_{\mu,\nu}\\
&= \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} \sum_{(\mu,\nu) \in \rho(\sigma_n,\tau_n)} \xi_{\mu,\nu} && \text{($\rho$ set-coupling map)}\\
&= \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} && \text{($\textstyle\sum \xi_{\mu,\nu} = 1$)}\\
&= \sum_{\tau \in S^{|\sigma|}} \Pr^{\pi}_{(s,t)}(Y \prec \tau \mid X \prec \sigma) = 1. && \text{(def. $\kappa_{\sigma,\tau}$; probability)}
\end{aligned}$$
So $\pi_S$ and $\pi_T$ are well-defined schedulers for $\mathcal{A}$. Given the above, we define the relation $R_\mathcal{C} \subseteq \Pi \times \Pi$ on schedulers for $\mathcal{A}$ by
$$R_\mathcal{C} = \{ (\pi_S, \pi_T) \mid \pi \text{ scheduler on } \mathcal{A}_\mathcal{C} \}.$$
To better understand the definition of $R_\mathcal{C}$, recall that $\mathcal{A}_\mathcal{C}$ can be interpreted as the automaton describing the concurrent execution of two copies of $\mathcal{A}$ synchronised according to $\mathcal{C}$. Then, $\pi_S$ and $\pi_T$ can be interpreted as the schedulers obtained from $\pi$ by respectively taking the left and right projections of the executions of $\mathcal{A}_\mathcal{C}$ as computations on $\mathcal{A}$. The relation $R_\mathcal{C}$ is the collection of these pairs of projections.

Next we prove that $R_\mathcal{C}$ is a set-coupling for $(\Pi, \Pi)$, that is,
$$\{ \pi_1 \mid \exists \pi_2 \in \Pi .\ (\pi_1, \pi_2) \in R_\mathcal{C} \} = \Pi \qquad \text{and} \qquad \{ \pi_2 \mid \exists \pi_1 \in \Pi .\ (\pi_1, \pi_2) \in R_\mathcal{C} \} = \Pi.$$
By definition of $R_\mathcal{C}$, this is equivalent to proving that for an arbitrary pair of schedulers $\pi_S, \pi_T$ for $\mathcal{A}$ we can find a scheduler $\pi$ for $\mathcal{A}_\mathcal{C}$ such that (7.3) and (7.4) hold (hence $(\pi_S, \pi_T) \in R_\mathcal{C}$).

Let $\sigma = \sigma_0 \dots \sigma_n$ and $\tau = \tau_0 \dots \tau_n$ be a pair of nonempty finite sequences of the same length over $S$, and assume $\pi_S(\sigma) = \sum_{\mu \in \delta(\sigma_n)} \alpha^\sigma_\mu \cdot \mu$ and $\pi_T(\tau) = \sum_{\nu \in \delta(\tau_n)} \beta^\tau_\nu \cdot \nu$, for some $\alpha^\sigma_\mu, \beta^\tau_\nu \in [0,1]$ such that $\sum_{\mu \in \delta(\sigma_n)} \alpha^\sigma_\mu = 1$ and $\sum_{\nu \in \delta(\tau_n)} \beta^\tau_\nu = 1$. We define
$$\pi(\langle\sigma,\tau\rangle) = \sum_{(\mu,\nu) \in \rho(\sigma_n,\tau_n)} \xi^{\sigma,\tau}_{\mu,\nu} \cdot f(\mu,\nu), \qquad \text{where } \xi^{\sigma,\tau}_{\mu,\nu} = \begin{cases} \alpha^\sigma_\mu \cdot \beta^\tau_\nu & \text{if } (\mu,\nu) \in \rho(\sigma_n,\tau_n),\\ 0 & \text{otherwise.} \end{cases}$$
By the fact that $\rho(\sigma_n,\tau_n)$ is a set-coupling in $\mathcal{R}(\delta(\sigma_n), \delta(\tau_n))$ and the definition of $\xi^{\sigma,\tau}_{\mu,\nu}$, it is easy to see that, for all $\mu \in \delta(\sigma_n)$ and $\nu \in \delta(\tau_n)$,
$$\sum_{\nu \in \delta(\tau_n)} \xi^{\sigma,\tau}_{\mu,\nu} = \alpha^\sigma_\mu \qquad \text{and} \qquad \sum_{\mu \in \delta(\sigma_n)} \xi^{\sigma,\tau}_{\mu,\nu} = \beta^\tau_\nu. \eqno(7.5)$$
Next we show that (7.3) holds. Let $\kappa_{\sigma,\tau} = \Pr^{\pi}_{(s,t)}(Y \prec \tau \mid X \prec \sigma)$; then
$$\begin{aligned}
\pi_S(\sigma)(u) &= \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau}\, \pi_S(\sigma)(u) && \text{(convex combination)}\\
&= \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} \Big( \sum_{\mu \in \delta(\sigma_n)} \alpha^\sigma_\mu \cdot \mu(u) \Big) && \text{(def. $\pi_S(\sigma)$)}\\
&= \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} \Big( \sum_{\mu \in \delta(\sigma_n)} \sum_{\nu \in \delta(\tau_n)} \xi^{\sigma,\tau}_{\mu,\nu} \cdot \mu(u) \Big) && \text{(by (7.5))}\\
&= \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} \Big( \sum_{\mu \in \delta(\sigma_n)} \sum_{\nu \in \delta(\tau_n)} \xi^{\sigma,\tau}_{\mu,\nu} \sum_{v \in S} f(\mu,\nu)(u,v) \Big) && \text{($f(\mu,\nu) \in \Omega(\mu,\nu)$)}\\
&= \sum_{\tau \in S^{|\sigma|}} \kappa_{\sigma,\tau} \Big( \sum_{(\mu,\nu) \in \rho(\sigma_n,\tau_n)} \xi^{\sigma,\tau}_{\mu,\nu} \sum_{v \in S} f(\mu,\nu)(u,v) \Big) && \text{($\rho$ set-coupling map)}\\
&= \sum_{\tau \in S^{|\sigma|}} \Big( \kappa_{\sigma,\tau} \sum_{v \in S} \sum_{(\mu,\nu) \in \rho(\sigma_n,\tau_n)} \xi^{\sigma,\tau}_{\mu,\nu} \cdot f(\mu,\nu)(u,v) \Big)\\
&= \sum_{\tau \in S^{|\sigma|}} \Big( \Pr^{\pi}_{(s,t)}(Y \prec \tau \mid X \prec \sigma) \sum_{v \in S} \pi(\langle\sigma,\tau\rangle)(u,v) \Big). && \text{(def. $\kappa_{\sigma,\tau}$ and $\pi$)}
\end{aligned}$$
Equation (7.4) is proven symmetrically.

Part 2:
We first prove that $\Pr^{\pi}_{(s,t)} \in \Omega(\Pr^{\pi_S}_s, \Pr^{\pi_T}_t)$. Showing the marginal conditions corresponds to proving that, for all nonempty sequences $\sigma = \sigma_0 \dots \sigma_n$ and $\tau = \tau_0 \dots \tau_n$ over $S$,
$$\Pr^{\pi_S}_s(X \prec \sigma) = \Pr^{\pi}_{(s,t)}(X \prec \sigma) \qquad \text{and} \qquad \Pr^{\pi_T}_t(Y \prec \tau) = \Pr^{\pi}_{(s,t)}(Y \prec \tau).$$
We prove only the equality on the left, as the other is similar. We proceed by induction on $n \ge 0$.

• Base case ($n = 0$). Then $\sigma = \sigma_0 \in S$ and
$$\begin{aligned}
\Pr^{\pi_S}_s(X \prec \sigma_0) &= \delta_s(\sigma_0) && \text{(def. $\Pr^{\pi_S}_s$)}\\
&= \sum_{\tau_0 \in S} \delta_s(\sigma_0) \cdot \delta_t(\tau_0) && \text{(convex combination)}\\
&= \sum_{\tau_0 \in S} \delta_{(s,t)}(\sigma_0, \tau_0) && \text{(def. $\delta_{(s,t)}$)}\\
&= \sum_{\tau_0 \in S} \Pr^{\pi}_{(s,t)}(X \prec \sigma_0,\, Y \prec \tau_0) && \text{(def. $\Pr^{\pi}_{(s,t)}$)}\\
&= \Pr^{\pi}_{(s,t)}(X \prec \sigma_0). && \text{(additivity)}
\end{aligned}$$

• Inductive step ($n \ge 0$). Let $\sigma = \sigma_0 \dots \sigma_n$ and $s' \in S$; then
$$\begin{aligned}
\Pr^{\pi_S}_s(X \prec \sigma s') &= \Pr^{\pi_S}_s(X \prec \sigma) \cdot \pi_S(\sigma)(s') && \text{(def. $\Pr^{\pi_S}_s$)}\\
&= \Pr^{\pi}_{(s,t)}(X \prec \sigma) \cdot \pi_S(\sigma)(s') && \text{(inductive hypothesis)}\\
&= \Pr^{\pi}_{(s,t)}(X \prec \sigma) \cdot \sum_{\tau \in S^{|\sigma|}} \Big( \Pr^{\pi}_{(s,t)}(Y \prec \tau \mid X \prec \sigma) \sum_{t' \in S} \pi(\langle\sigma,\tau\rangle)(s',t') \Big) && \text{(eq. (7.3))}\\
&= \sum_{\tau \in S^{|\sigma|}} \sum_{t' \in S} \Pr^{\pi}_{(s,t)}(X \prec \sigma) \cdot \Pr^{\pi}_{(s,t)}(Y \prec \tau \mid X \prec \sigma) \cdot \pi(\langle\sigma,\tau\rangle)(s',t')\\
&= \sum_{\tau \in S^{|\sigma|}} \sum_{t' \in S} \Pr^{\pi}_{(s,t)}(X \prec \sigma,\, Y \prec \tau) \cdot \pi(\langle\sigma,\tau\rangle)(s',t') && \text{(by (7.1))}\\
&= \sum_{\tau \in S^{|\sigma|}} \sum_{t' \in S} \Pr^{\pi}_{(s,t)}(X \prec \sigma s',\, Y \prec \tau t') && \text{(def. $\Pr^{\pi}_{(s,t)}$)}\\
&= \Pr^{\pi}_{(s,t)}(X \prec \sigma s'). && \text{(additivity)}
\end{aligned}$$
The right-marginal condition is proven symmetrically.

Note that the discrepancy $\gamma_\mathcal{C}(s,t)$ is the maximal probability of reaching a state pair $(u,v)$ such that $\ell(u) \ne \ell(v)$ by starting from the state pair $(s,t)$ in $\mathcal{A}_\mathcal{C}$. That is,
$$\gamma_\mathcal{C}(s,t) = \sup_\pi \Pr^{\pi}_{(s,t)}(\ell(X) \ne \ell(Y)), \eqno(7.6)$$
where $\pi$ ranges over all schedulers for $\mathcal{A}_\mathcal{C}$. Thus, from $\Pr^{\pi}_{(s,t)} \in \Omega(\Pr^{\pi_S}_s, \Pr^{\pi_T}_t)$ we have
$$\begin{aligned}
\Pr^{\pi_S}_s(\ell(X) \in E) &= \Pr^{\pi}_{(s,t)}(\ell(X) \in E)\\
&\ge \Pr^{\pi}_{(s,t)}(\ell(X) = \ell(Y),\, \ell(Y) \in E)\\
&= 1 - \Pr^{\pi}_{(s,t)}(\{\ell(X) \ne \ell(Y)\} \cup \{\ell(Y) \notin E\})\\
&\ge 1 - \Pr^{\pi}_{(s,t)}(\ell(X) \ne \ell(Y)) - \Pr^{\pi}_{(s,t)}(\ell(Y) \notin E)\\
&= \Pr^{\pi}_{(s,t)}(\ell(Y) \in E) - \Pr^{\pi}_{(s,t)}(\ell(X) \ne \ell(Y))\\
&\ge \Pr^{\pi}_{(s,t)}(\ell(Y) \in E) - \gamma_\mathcal{C}(s,t)\\
&= \Pr^{\pi_T}_t(\ell(Y) \in E) - \gamma_\mathcal{C}(s,t).
\end{aligned}$$
From the above, and the symmetric argument, we conclude that $|\Pr^{\pi_S}_s(\ell(X) \in E) - \Pr^{\pi_T}_t(\ell(Y) \in E)| \le \gamma_\mathcal{C}(s,t)$.
The proof follows immediately by combining Parts 1 and 2.

Given Lemma 7.3, it is easy to establish the result stated in Theorem 7.1.
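Before turning to the proof of Theorem 7.1, note that the inequality at the heart of Part 2 admits a simple finite illustration: whenever a joint distribution couples two trace distributions, their difference on any event is bounded by the probability that the coupled traces disagree. The joint distribution and trace names below are hypothetical.

```python
from itertools import combinations

# A hypothetical coupling: joint probability of (left trace, right trace).
TRACES = ['ab', 'ba', 'bb']
JOINT = {('ab', 'ab'): 0.4, ('ba', 'ba'): 0.3, ('ab', 'bb'): 0.2, ('bb', 'ba'): 0.1}

def p_left(E):                   # left marginal probability of the event E
    return sum(p for (x, _), p in JOINT.items() if x in E)

def p_right(E):                  # right marginal probability of the event E
    return sum(p for (_, y), p in JOINT.items() if y in E)

# probability that the coupled traces disagree (cf. the discrepancy in (7.6))
p_disagree = sum(p for (x, y), p in JOINT.items() if x != y)

# |Pr(X in E) - Pr(Y in E)| <= Pr(X != Y) for *every* event E: on {X = Y}
# the two traces are either both in E or both outside it.
for r in range(len(TRACES) + 1):
    for E in combinations(TRACES, r):
        assert abs(p_left(set(E)) - p_right(set(E))) <= p_disagree + 1e-12

print('disagreement probability:', p_disagree)
```

Lemma 7.3 applies exactly this observation, with the coupling supplied by a scheduler of $\mathcal{A}_\mathcal{C}$ and the disagreement probability bounded by $\gamma_\mathcal{C}(s,t)$.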
Proof of Theorem 7.1.
Let $d_{\mathbb{R}}(x,y) = |x - y|$ denote the Euclidean distance on the real line, and define
$$K = \{ \Pr^{\pi}_s(\ell(X) \in E) \mid \pi \in \Pi \} \qquad \text{and} \qquad H = \{ \Pr^{\pi}_t(\ell(X) \in E) \mid \pi \in \Pi \}.$$
Then, by [Mém11, Lemma 3.2],
$$\mathcal{H}(d_{\mathbb{R}})(K, H) \ge \max\{ |\sup K - \sup H|,\, |\inf K - \inf H| \} = \max\{ |\mathrm{Max}_s(E) - \mathrm{Max}_t(E)|,\, |\mathrm{Min}_s(E) - \mathrm{Min}_t(E)| \}.$$
Next we show that $d(s,t) \ge \mathcal{H}(d_{\mathbb{R}})(K, H)$:
$$\begin{aligned}
\mathcal{H}(d_{\mathbb{R}})(K, H) &= \inf \Big\{ \sup_{(\pi,\pi') \in R} |\Pr^{\pi}_s(\ell(X) \in E) - \Pr^{\pi'}_t(\ell(X) \in E)| \;\Big|\; R \in \mathcal{R}(\Pi, \Pi) \Big\} && \text{(Theorem 2.1)}\\
&\le \inf \{ \gamma_\mathcal{C}(s,t) \mid \mathcal{C} \text{ coupling structure for } \mathcal{A} \} && \text{(Lemma 7.3)}\\
&= d(s,t). && \text{(Theorem 5.7)}
\end{aligned}$$
Therefore, $|\mathrm{Max}_s(E) - \mathrm{Max}_t(E)| \le d(s,t)$ and $|\mathrm{Min}_s(E) - \mathrm{Min}_t(E)| \le d(s,t)$.

Another consequence of Lemma 7.3 is that the bisimilarity distance provides an upper bound on the Hausdorff lifting of the variational distance between the sets of distributions induced by the Markov chains obtained by ranging over all possible schedulers. In the theorem below we use $TV$ to denote the total variation distance between probability measures, defined as
$TV(\mu, \nu) = \sup_E |\mu(E) - \nu(E)|$, where $E$ ranges over all measurable subsets.

Theorem 7.4. $\mathcal{H}(TV)(\{ \Pr^{\pi}_s(\ell(X) \in \cdot) \mid \pi \in \Pi \},\, \{ \Pr^{\pi}_t(\ell(X) \in \cdot) \mid \pi \in \Pi \}) \le d(s,t)$.

Proof.
$$\begin{aligned}
&\mathcal{H}(TV)(\{ \Pr^{\pi}_s(\ell(X) \in \cdot) \mid \pi \in \Pi \},\, \{ \Pr^{\pi}_t(\ell(X) \in \cdot) \mid \pi \in \Pi \})\\
&\quad= \inf \Big\{ \sup_{(\pi,\pi') \in R} TV(\Pr^{\pi}_s(\ell(X) \in \cdot),\, \Pr^{\pi'}_t(\ell(X) \in \cdot)) \;\Big|\; R \in \mathcal{R}(\Pi, \Pi) \Big\} && \text{(Theorem 2.1)}\\
&\quad= \inf \Big\{ \sup_{(\pi,\pi') \in R} \sup_E |\Pr^{\pi}_s(\ell(X) \in E) - \Pr^{\pi'}_t(\ell(X) \in E)| \;\Big|\; R \in \mathcal{R}(\Pi, \Pi) \Big\} && \text{(def. $TV$)}\\
&\quad\le \inf \{ \gamma_\mathcal{C}(s,t) \mid \mathcal{C} \text{ coupling structure for } \mathcal{A} \} && \text{(Lemma 7.3)}\\
&\quad= d(s,t). && \text{(Theorem 5.7)}
\end{aligned}$$

Theorem 7.4 can alternatively be stated as follows: for any scheduler $\pi$ there exists a scheduler $\pi'$ such that $|\Pr^{\pi}_s(\ell(X) \in E) - \Pr^{\pi'}_t(\ell(X) \in E)| \le d(s,t)$ for all measurable subsets $E \subseteq L^\omega$.

Conclusion and Future Work
We presented a novel characterization of the probabilistic bisimilarity distance of Deng et al. [DCPP06] as the solution of a simple stochastic game. Starting from it, we designed algorithms for computing the distances based on Condon's simple policy iteration algorithm. The correctness of Condon's approach relies on the assumption that the input game is stopping. This may not be the case for our probabilistic bisimilarity games when the discount factor is one. We overcame this problem by means of an improved termination condition based on the notion of self-closed relation due to Fu [Fu12].

As in [TvB16], our simple policy iteration algorithm has exponential worst-case time complexity. Nevertheless, experiments show that our method can compete in practice with the value iteration algorithm by Fu [Fu12], which has theoretical polynomial-time complexity for $\lambda < 1$. To the best of our knowledge, our algorithm is the first practical solution for computing the bisimilarity distance when $\lambda = 1$, performing orders of magnitude faster than the existing solutions based on the existential fragment of the first-order theory of the reals [CdAMR08, CdAMR10, CHL07].

As future work, we plan to improve upon the current implementation in the line of [TvB18a], by exploiting the fact that bisimilar states and states at probabilistic distance one [TvB18b] can be efficiently pre-computed before starting the policy iteration. We believe that this would yield a significant cut-down in the time required to compute the discrepancy at each iteration, which turned out to be the bottleneck of our algorithms.

More efficient algorithms might speed up verification tools for concurrent probabilistic systems, as behavioural distances relate to the satisfiability of logical properties. For the case of labelled Markov chains, in [CvBW12, BBLM15] the variational difference between two states with respect to their probability of satisfying linear-time properties (e.g., LTL formulas) is shown to be bounded by the (undiscounted) probabilistic bisimilarity distance. In Section 7 we showed that a similar result holds for probabilistic automata, with additional subtleties that arise from the need to handle nondeterminism. In light of this relation, it would be interesting to develop approximation techniques to cut down the overall model-checking time of probabilistic automata, as briefly discussed in Remark 7.2. We also plan to extend the work on approximate minimization [BBLM17, BBLM18] to probabilistic automata, and to explore the possible relation between the probabilistic bisimilarity distance and more expressive logics for concurrent probabilistic systems [CdAMR08, CdAMR10, Mio12].

References

[BBL+
19] Giorgio Bacci, Giovanni Bacci, Kim G. Larsen, Radu Mardare, Qiyi Tang, and Franck van Breugel. Computing Probabilistic Bisimilarity Distances for Probabilistic Automata. In CONCUR 2019, volume 140 of LIPIcs, pages 9:1–9:17. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019.
[BBLM13] Giorgio Bacci, Giovanni Bacci, Kim G. Larsen, and Radu Mardare. On-the-Fly Exact Computation of Bisimilarity Distances. In TACAS 2013, volume 7795 of Lecture Notes in Computer Science, pages 1–15. Springer, 2013.
[BBLM15] Giorgio Bacci, Giovanni Bacci, Kim G. Larsen, and Radu Mardare. Converging from Branching to Linear Metrics on Markov Chains. In Proceedings, volume 9399 of Lecture Notes in Computer Science, pages 349–367. Springer, 2015.
[BBLM17] Giovanni Bacci, Giorgio Bacci, Kim G. Larsen, and Radu Mardare. On the Metric-Based Approximate Minimization of Markov Chains. In ICALP 2017, volume 80 of LIPIcs, pages 104:1–104:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017.
[BBLM18] Giovanni Bacci, Giorgio Bacci, Kim G. Larsen, and Radu Mardare. On the metric-based approximate minimization of Markov Chains. J. Log. Algebr. Meth. Program., 100:36–56, 2018.
[BCD+11] Clark Barrett, Christopher L. Conway, Morgan Deters, Liana Hadarean, Dejan Jovanović, Tim King, Andrew Reynolds, and Cesare Tinelli. CVC4. In Ganesh Gopalakrishnan and Shaz Qadeer, editors, CAV 2011, volume 6806 of Lecture Notes in Computer Science, pages 171–177. Springer, 2011.
[BK08] Christel Baier and Joost-Pieter Katoen. Principles of Model Checking. MIT Press, 2008.
[CdAMR08] Krishnendu Chatterjee, Luca de Alfaro, Rupak Majumdar, and Vishwanath Raman. Algorithms for Game Metrics. In IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2008, volume 2 of LIPIcs, pages 107–118. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2008.
[CdAMR10] Krishnendu Chatterjee, Luca de Alfaro, Rupak Majumdar, and Vishwanath Raman. Algorithms for Game Metrics (Full Version). Logical Methods in Computer Science, 6(3), 2010.
[CHL07] Taolue Chen, Tingting Han, and Jian Lu. On Behavioral Metric for Probabilistic Systems: Definition and Approximation Algorithm. Pages 21–25. IEEE Computer Society, 2007.
[Con90] Anne Condon. On Algorithms for Simple Stochastic Games. In Advances in Computational Complexity Theory, volume 13 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pages 51–72. DIMACS/AMS, 1990.
[Con92] Anne Condon. The Complexity of Stochastic Games. Inf. Comput., 96(2):203–224, 1992.
[CvBW12] Di Chen, Franck van Breugel, and James Worrell. On the Complexity of Computing Probabilistic Bisimilarity. In FoSSaCS 2012, volume 7213 of Lecture Notes in Computer Science, pages 437–451. Springer, 2012.
[DCPP06] Yuxin Deng, Tom Chothia, Catuscia Palamidessi, and Jun Pang. Metrics for Action-labelled Quantitative Transition Systems. Electr. Notes Theor. Comput. Sci., 153(2):79–96, 2006.
[DGJP04] Josée Desharnais, Vineet Gupta, Radha Jagadeesan, and Prakash Panangaden. Metrics for Labelled Markov Processes. Theor. Comput. Sci., 318(3):323–354, 2004.
[DLT08] Josée Desharnais, François Laviolette, and Mathieu Tracol. Approximate Analysis of Probabilistic Processes: Logic, Simulation and Games. Pages 264–273. IEEE Computer Society, 2008.
[FKP17] Nathanaël Fijalkow, Bartek Klin, and Prakash Panangaden. Expressiveness of Probabilistic Modal Logics, Revisited. In ICALP 2017, volume 80 of LIPIcs, pages 105:1–105:12. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017.
[Fu12] Hongfei Fu. Computing Game Metrics on Markov Decision Processes. In ICALP 2012, volume 7392 of Lecture Notes in Computer Science, pages 227–238. Springer, 2012.
[Fu14] Hongfei Fu. Verifying Probabilistic Systems: New Algorithms and Complexity Results. PhD thesis, RWTH Aachen, Aachen, Germany, November 2014.
[GJS90] Alessandro Giacalone, Chi-Chang Jou, and Scott A. Smolka. Algebraic Reasoning for Probabilistic Concurrent Systems. In Proceedings of the IFIP WG 2.2/2.3 Working Conference on Programming Concepts and Methods, pages 443–458. North-Holland, 1990.
[Hau14] Felix Hausdorff. Grundzüge der Mengenlehre. Verlag Von Veit & Comp, Leipzig, 1914.
[JL91] Bengt Jonsson and Kim G. Larsen. Specification and Refinement of Probabilistic Processes. In LICS 1991, pages 266–277. IEEE Computer Society, 1991.
[Jub05] Brendan Juba. On the Hardness of Simple Stochastic Games. Master's thesis, Carnegie Mellon University, Pittsburgh, PA, USA, May 2005.
[Kan42] Leonid Vitalevich Kantorovich. On the transfer of masses (in Russian). Doklady Akademii Nauk, 5(5-6):1–4, 1942. Translated in Management Science, 1958.
[KM18] Barbara König and Christina Mika-Michalski. (Metric) Bisimulation Games and Real-Valued Modal Logics for Coalgebras. In CONCUR 2018, volume 118 of LIPIcs, pages 37:1–37:17. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018.
[KS95] Peter Kleinschmidt and Heinz Schannath. A Strongly Polynomial Algorithm for the Transportation Problem. Math. Program., 68:1–13, 1995.
[LL69] Thomas Liggett and Steven A. Lippman. Stochastic Games with Perfect Information and Time Average Payoff. SIAM Review, 11(4):604–607, 1969.
[Mém11] Facundo Mémoli. Gromov-Wasserstein Distances and the Metric Approach to Object Matching. Foundations of Computational Mathematics, 11(4):417–487, 2011.
[Mio12] Matteo Mio. On the Equivalence of Game and Denotational Semantics for the Probabilistic µ-calculus. Logical Methods in Computer Science, 8(2), 2012.
[Orl85] James B. Orlin. On the Simplex Algorithm for Networks and Generalized Networks. In Mathematical Programming Essays in Honor of George B. Dantzig Part I, pages 166–178. Springer Berlin Heidelberg, 1985.
[PC19] Gabriel Peyré and Marco Cuturi. Computational Optimal Transport. Foundations and Trends in Machine Learning, 11(5-6):355–607, 2019.
[Put94] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, Inc., New York, NY, USA, 1st edition, 1994.
[Sch99] Alexander Schrijver. Theory of Linear and Integer Programming. Wiley-Interscience Series in Discrete Mathematics and Optimization. Wiley, 1999.
[Seg95] Roberto Segala. Modeling and Verification of Randomized Distributed Real-time Systems. PhD thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, 1995.
[Sha53] Lloyd S. Shapley. Stochastic Games. Proceedings of the National Academy of Sciences, 39(10):1095–1100, 1953.
[SL94] Roberto Segala and Nancy A. Lynch. Probabilistic Simulations for Probabilistic Processes. In CONCUR 1994, volume 836 of Lecture Notes in Computer Science, pages 481–496. Springer, 1994.
[Str89] James K. Strayer. Linear Programming and its Applications. Undergraduate Texts in Mathematics. Springer-Verlag, New York, NY, USA, 1989.
[Tan18] Qiyi Tang. Computing Probabilistic Bisimilarity Distances. PhD thesis, York University, Toronto, Canada, August 2018.
[TvB16] Qiyi Tang and Franck van Breugel. Computing Probabilistic Bisimilarity Distances via Policy Iteration. In CONCUR 2016, volume 59 of LIPIcs, pages 22:1–22:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2016.
[TvB18a] Qiyi Tang and Franck van Breugel. Deciding Probabilistic Bisimilarity Distance One for Labelled Markov Chains. In CAV 2018, volume 10981 of Lecture Notes in Computer Science, pages 681–699. Springer, 2018.
[TvB18b] Qiyi Tang and Franck van Breugel. Deciding Probabilistic Bisimilarity Distance One for Probabilistic Automata. In Sven Schewe and Lijun Zhang, editors, CONCUR 2018, volume 118 of Leibniz International Proceedings in Informatics (LIPIcs), pages 9:1–9:17. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018.
[vB12] Franck van Breugel. On Behavioural Pseudometrics and Closure Ordinals. Inf. Process. Lett., 112(19):715–718, 2012.
[vBSW08] Franck van Breugel, Babita Sharma, and James Worrell. Approximating a Behavioural Pseudometric without Discount for Probabilistic Systems. Logical Methods in Computer Science, 4(2), 2008.
[vBW14] Franck van Breugel and James Worrell. The Complexity of Computing a Bisimilarity Pseudometric on Probabilistic Automata. In Horizons of the Mind. A Tribute to Prakash Panangaden - Essays Dedicated to Prakash Panangaden on the Occasion of His 60th Birthday, volume 8464 of Lecture Notes in Computer Science, pages 191–213. Springer, 2014.
[Vil08] Cédric Villani. Optimal Transport: Old and New. Grundlehren der mathematischen Wissenschaften. Springer, 2008.
[ZP96] Uri Zwick and Mike Paterson. The complexity of mean payoff games on graphs. Theoretical Computer Science, 158(1-2):343–359, 1996.