Causality for nonlocal phenomena

Abstract

Drawing from the theory of optimal transport we propose a rigorous notion of a causal relation for Borel probability measures on a given spacetime. To prepare the ground, we explore the borderland between causality, topology and measure theory. We provide various characterisations of the proposed causal relation, which turn out to be equivalent if the underlying spacetime has a sufficiently robust causal structure. We also present the notion of the 'Lorentz-Wasserstein distance' and study its basic properties. Finally, we discuss how various results on causality in quantum theory, aggregated around Hegerfeldt's theorem, fit into our framework.


arXiv: [math-ph], Oct

Michał Eckstein (a,b), Tomasz Miller (c,b)

(a) Faculty of Physics, Astronomy and Applied Computer Science, Jagiellonian University, ul. prof. Stanisława Łojasiewicza 11, 30-348 Kraków, Poland
(b) Copernicus Center for Interdisciplinary Studies, ul. Sławkowska 17, 31-016 Kraków, Poland
(c) Faculty of Mathematics and Information Science, Warsaw University of Technology, ul. Koszykowa 75, 00-662 Warsaw, Poland


The notion of a space, understood as a set of points, provides an indispensable framework for every physical theory. But, regardless of the physical system that is being modelled, the space itself is not directly observable. Indeed, any measuring apparatus can provide information about the localisation only up to a finite resolution. In the relativistic context, it means that the event is an idealised concept, which is not accessible to any observer.

Apart from the 'practical' obstructions to measuring position, there also exist fundamental ones due to the quantum effects manifest at small scales. Although non-relativistic quantum mechanics does not impose any a priori restrictions on the accuracy of the position measurement, in quantum field theory a suitable 'position operator' is always nonlocal (see for instance [10, 16, 41]). Moreover, an attempt to perform a very accurate measurement of localisation in spacetime would require the use of signals of very short wavelength, resulting in an extreme concentration of energy. The latter would eventually lead to black hole formation and the desired information would become trapped [13, 14].

It is generally believed that any physical theory should be causal, i.e. that no information can be transmitted at a speed exceeding the velocity of light. Indeed, despite some controversies (compare for example [2] and [26]), no evidence of a physical process that would involve superluminal signalling was found in any system (see for instance [47]), even at the level of the Planck scale [1].

In relativity theory it is straightforward to implement the postulate of causality, as the Lorentzian metric induces a precise notion of causal curves. Although Einstein's equations admit spacetime solutions with closed causal curves, they are usually discarded as unphysical [23].

On the other hand, the status of causality in quantum theory is much more subtle because of its nonlocal nature.
Hegerfeldt's theorem [24] (see also [25, 27, 28]) implies that during the evolution of a generic quantum system driven by a Hamiltonian bounded from below, an initially localised state immediately develops infinite tails. However, whereas initial localisation implies the breakdown of Einstein causality, the use of nonlocal states does not guarantee a subluminal evolution. In fact, the results of Hegerfeldt suggest [25] that acausal evolution is a feature of the quantum system and not of the state. In other words, if a system admits superluminal propagation, one could use nonlocal states to effectuate faster-than-light communication.

In quantum field theory the nonlocality is even more prevalent, but it does not allow for communication between spacelike separated regions of spacetime [15]. There is thus strong evidence that quantum theory, despite its inherent nonlocality, conforms to the principle of causality [36]. In fact, the requirement of no faster-than-light signalling is often used as a guiding principle to constrain the admissible quantum theories [5, 21] and their possible extensions [34]. In quantum field theory it is reflected by promoting the principle of microscopic causality to one of the axioms [22, 40].

However, the study of causation in quantum theory (and other nonlocal theories) is far from being complete. One of the stumbling blocks is the lack of a suitable notion of causality for nonlocal objects, like wave functions. To properly investigate Einstein's principle, one needs to disentangle nonlocality from the potential causality violation, as for instance the interference fringes can travel superluminally, but cannot be utilised to send information [8].
Also, to our knowledge, in the study of causality in quantum systems time was treated as an external parameter, whereas the most riveting consequences of Einstein causality, in particular the existence of horizons, manifest themselves in curved spacetimes.

The aim of this paper is to provide a rigorous notion of a causal relation between probability measures on a given spacetime. Such measures can be utilised to model classical spread objects, for instance charge or energy density, as well as quantum probabilities obtained via the 'modulus square principle' from wave functions. Moreover, one can make use of probability measures to take into account experimental errors, as the measurement of any physical object's spacetime localisation would effectively be vitiated by an error resulting from the apparatus' imperfection.

We allow the probability measures to be spread also in the timelike direction, as typical states in quantum field theory extend over the whole spacetime [11]. We work in a generally covariant framework, hence our definitions and results apply to any curved spacetime with a sufficiently rich causal structure.

Footnote: 'Localised' in the context of Hegerfeldt's theorem usually means of compact support in space, but the argument extends to states with exponentially bounded tails [25].

The paper is organised as follows: In Section 2 we recall some basic notions in topology, measure theory and causality, to make the paper self-contained and accessible to a broad range of researchers. Section 3 contains the first result on the verge of causality and measurability, which establishes the foundations for the developed theory. The main concepts […] causal functions [32, Definition 2.3], proposed in [17] in a much wider context of noncommutative geometry.
In several steps we show that it encapsulates an intuitive notion of causality for nonlocal objects:

Each infinitesimal part of the probability density should travel along a future-directed causal curve.

At each step we keep the causality conditions imposed on the underlying spacetime as low as possible. At the same time we provide several characterisations of causality for probability measures, which illustrate the concept and provide tools for concrete computations.

Motivated by the main result, we put forward in Section 4.2 a definition of a causal relation between the probability measures, valid on any spacetime, and study its properties. In particular, we demonstrate that the proposed relation is a partial order in the space of Borel probability measures on a given spacetime $\mathcal{M}$, even when $\mathcal{M}$ has a relatively poor causal structure.

Finally, drawing from the theory of optimal transport adapted to the relativistic setting, we propose in Section 5 a notion of the 'Lorentz–Wasserstein' distance in the space of measures.

We conclude, in Section 6, with an outlook into the possible future developments and applications. In particular, we briefly discuss the potential use of the presented results in the study of causality in quantum theory. We also address the interrelation of probability measures with states on $C^*$-algebras. In this way we provide a link with the notion of 'causality in the space of states' proposed originally in [17] in the framework of noncommutative geometry.

Throughout the paper we denote $\mathbb{N} := \{1, 2, \ldots\}$ and $\mathbb{N}_0 := \mathbb{N} \cup \{0\}$. The spaces of continuous, continuous and bounded, and continuous and compactly supported real-valued functions on a topological space $\mathcal{M}$ will be respectively denoted by $C(\mathcal{M})$, $C_b(\mathcal{M})$, $C_c(\mathcal{M})$. Analogous spaces of smooth functions will be respectively denoted by $C^\infty(\mathcal{M})$, $C^\infty_b(\mathcal{M})$, $C^\infty_c(\mathcal{M})$.

Let $\mathcal{M}$ denote a topological space and let $\mathcal{X} \subseteq \mathcal{M}$.
The closure, interior, boundary and complement of $\mathcal{X}$ will be denoted, respectively, by $\overline{\mathcal{X}}$, $\operatorname{int} \mathcal{X}$, $\partial \mathcal{X}$ and $\mathcal{X}^c$.

An open cover of $\mathcal{X} \subseteq \mathcal{M}$ is a family $\{U_\alpha\}_{\alpha \in A}$ of open subsets of $\mathcal{M}$ such that $\bigcup_{\alpha \in A} U_\alpha \supseteq \mathcal{X}$. $\mathcal{M}$ is called Lindelöf iff each of its open covers has a countable subcover.

A subset $\mathcal{X} \subseteq \mathcal{M}$ is called compact iff each of its open covers has a finite subcover. It is called sequentially compact iff every sequence in $\mathcal{X}$ has a subsequence convergent in $\mathcal{X}$. It is called precompact (or relatively compact) iff its closure is compact. Finally, it is called $\sigma$-compact iff it is a countable union of compact subsets. In particular, $\mathcal{M}$ is $\sigma$-compact if and only if it admits an exhaustion by compact sets, that is a sequence $(K_n)_{n \in \mathbb{N}}$ of compact sets such that $K_n \subseteq K_{n+1}$ and $\bigcup_{n=1}^{\infty} K_n = \mathcal{M}$.

If $\mathcal{M}$ is Hausdorff, then each of its compact subsets is closed. If $\mathcal{M}$ is second-countable, that is if $\mathcal{M}$ has a countable base, then the notions of compactness and sequential compactness coincide.

A Hausdorff space $\mathcal{M}$ is called locally compact iff each of its points has a precompact neighbourhood.

$\mathcal{M}$ is called separable iff there exists a countable subset $\{a_n\}_{n \in \mathbb{N}} \subseteq \mathcal{M}$ dense in $\mathcal{M}$. Every open subspace of a separable space is itself separable.

$\mathcal{M}$ is called (completely) metrisable iff there exists a (complete) metric $\rho : \mathcal{M}^2 \to \mathbb{R}_{\geq 0}$ inducing its topology. Fixing a metric allows one to talk about balls. By $B(x, \varepsilon) := \{ y \in \mathcal{M} \mid \rho(x, y) < \varepsilon \}$ we denote an open ball centered at $x \in \mathcal{M}$ of radius $\varepsilon > 0$. By $\overline{B}(x, \varepsilon)$ we denote its closure. Finally, $\mathcal{M}$ is called Polish iff it is separable and completely metrisable.

In the following, we are going to work with spacetimes (see Section 2.3), which are examples of second-countable locally compact Hausdorff (LCH) spaces. Every such space is

• Lindelöf [46, Theorem 16.9];
• Polish, because [46, Theorems 19.3 & 23.1] imply that every second-countable LCH space is metrisable, and by [39, Corollary 2.3.32] this means that every second-countable LCH space is Polish;
• $\sigma$-compact, because by taking a countable set $\{a_n\}_{n \in \mathbb{N}}$ dense in $\mathcal{M}$ (existing by separability), and denoting by $U_n$ a precompact neighbourhood of $a_n$ (existing by local compactness), one has that $\mathcal{M} = \bigcup_{n=1}^{\infty} \overline{U_n}$.

Moreover, every open subspace of a second-countable LCH space is itself a second-countable LCH space [46, Theorem 18.4].

LCH spaces satisfy a somewhat modified version of Urysohn's lemma [38, 2.12].

Theorem 1. (Urysohn's lemma, LCH version) Let $\mathcal{M}$ be a LCH space and let $K \subseteq U \subseteq \mathcal{M}$, where $K$ is compact and $U$ is open. Then, there exists $f \in C_c(\mathcal{M})$ such that $f|_K \equiv 1$, $0 \leq f \leq 1$ and $\operatorname{supp} f \subseteq U$.

Let $\mathcal{M}$ be a topological space. The $\sigma$-algebra of Borel sets $\mathcal{B}(\mathcal{M})$ is the smallest family of subsets of $\mathcal{M}$ containing the open sets, which is closed under complements and countable unions (and hence also under countable intersections). If $\mathcal{M}$ is a Hausdorff space then, in particular, its $\sigma$-compact subsets are Borel.

A function $f : \mathcal{M}_1 \to \mathcal{M}_2$ between topological spaces is called Borel iff $f^{-1}(V) \in \mathcal{B}(\mathcal{M}_1)$ for any $V \in \mathcal{B}(\mathcal{M}_2)$. Every continuous (or even semi-continuous) real-valued function is Borel, but not vice versa.

A Borel probability measure on $\mathcal{M}$ is a function $\mu : \mathcal{B}(\mathcal{M}) \to \mathbb{R}_{\geq 0}$ satisfying $\mu(\mathcal{M}) = 1$ and $\mu\left( \bigcup_{n=1}^{\infty} \mathcal{X}_n \right) = \sum_{n=1}^{\infty} \mu(\mathcal{X}_n)$ for any $\{\mathcal{X}_n\}_{n \in \mathbb{N}} \subseteq \mathcal{B}(\mathcal{M})$ such that $\mathcal{X}_n \cap \mathcal{X}_m = \emptyset$ for $n \neq m$. A Borel set whose measure $\mu$ is zero is called $\mu$-null. The pair $(\mathcal{M}, \mu)$ is called a probability space.
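The set of Borel probability measures on $\mathcal{M}$ will be denoted by $\mathcal{P}(\mathcal{M})$.

Footnote: There exist many definitions of local compactness, which are all equivalent in Hausdorff spaces.

Any $\mu \in \mathcal{P}(\mathcal{M})$ has the following properties [38, Theorem 1.19]:

• $\mu(\emptyset) = 0$;
• $\mu$ is monotone, i.e. $\forall\, \mathcal{X}_1, \mathcal{X}_2 \in \mathcal{B}(\mathcal{M}) \quad \mathcal{X}_1 \subseteq \mathcal{X}_2 \Rightarrow \mu(\mathcal{X}_1) \leq \mu(\mathcal{X}_2)$;
• $\mu$ is countably subadditive, i.e. $\forall\, \{\mathcal{X}_n\}_{n \in \mathbb{N}} \subseteq \mathcal{B}(\mathcal{M}) \quad \mu\left( \bigcup_{n=1}^{\infty} \mathcal{X}_n \right) \leq \sum_{n=1}^{\infty} \mu(\mathcal{X}_n)$;
• for any sequence $(\mathcal{X}_n)_{n \in \mathbb{N}} \subseteq \mathcal{B}(\mathcal{M})$ which is increasing, i.e. $\mathcal{X}_n \subseteq \mathcal{X}_{n+1}$, it is true that

$$\mu\left( \bigcup_{n=1}^{\infty} \mathcal{X}_n \right) = \lim_{n \to +\infty} \mu(\mathcal{X}_n); \qquad (1)$$

• for any sequence $(\mathcal{X}_n)_{n \in \mathbb{N}} \subseteq \mathcal{B}(\mathcal{M})$ which is decreasing, i.e. $\mathcal{X}_n \supseteq \mathcal{X}_{n+1}$, it is true that

$$\mu\left( \bigcap_{n=1}^{\infty} \mathcal{X}_n \right) = \lim_{n \to +\infty} \mu(\mathcal{X}_n). \qquad (2)$$

Furthermore, if $\mathcal{M}$ is metrisable, then every $\mu \in \mathcal{P}(\mathcal{M})$ is regular [39, Lemma 3.4.14], i.e.

$$\forall\, \mathcal{X} \in \mathcal{B}(\mathcal{M}) \quad \mu(\mathcal{X}) = \sup \{ \mu(F) \mid F \subseteq \mathcal{X},\ F \text{ closed} \} = \inf \{ \mu(U) \mid U \supseteq \mathcal{X},\ U \text{ open} \}. \qquad (3)$$

Finally, if $\mathcal{M}$ is Polish, then every $\mu \in \mathcal{P}(\mathcal{M})$ is also tight [39, Theorem 3.4.20], i.e.

$$\forall\, \mathcal{X} \in \mathcal{B}(\mathcal{M}) \quad \mu(\mathcal{X}) = \sup \{ \mu(K) \mid K \subseteq \mathcal{X},\ K \text{ compact} \}. \qquad (4)$$

Borel probability measures with properties (3) and (4) are called Radon probability measures. Since we will be working with spacetimes (which are Polish spaces), all elements of $\mathcal{P}(\mathcal{M})$ will be Radon. For simplicity, from now on the term 'measure' will always stand for the 'Borel probability measure'.

For any $\mathcal{X} \subseteq \mathcal{M}$ its indicator function $\mathbf{1}_{\mathcal{X}} : \mathcal{M} \to \mathbb{R}$ is defined by $\mathbf{1}_{\mathcal{X}}(p) = 1$ for $p \in \mathcal{X}$ and $\mathbf{1}_{\mathcal{X}}(p) = 0$ otherwise. $\mathbf{1}_{\mathcal{X}}$ is a Borel function iff $\mathcal{X} \in \mathcal{B}(\mathcal{M})$.

A simple function on $\mathcal{M}$ is any function $s : \mathcal{M} \to \mathbb{R}$ whose range $s(\mathcal{M})$ is finite. Such a function can be written in the form $s = \sum_{i=1}^{n} \alpha_i \mathbf{1}_{\mathcal{X}_i}$, where $s(\mathcal{M}) = \{\alpha_1, \ldots, \alpha_n\}$ and $\mathcal{X}_i = s^{-1}(\alpha_i)$ ($i = 1, \ldots, n$). Notice that $s$ is Borel iff all $\mathcal{X}_i$'s are Borel sets.

For any $\mu \in \mathcal{P}(\mathcal{M})$ the (Lebesgue) integral of a Borel nonnegative function $f$ is defined as

$$\int_{\mathcal{M}} f \, d\mu := \sup \left\{ \sum_{i=1}^{n} \alpha_i \mu(\mathcal{X}_i) \ \middle|\ \sum_{i=1}^{n} \alpha_i \mathbf{1}_{\mathcal{X}_i} \leq f \right\}.$$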
It is well-defined by [38, Theorem 1.17], although it might be infinite. Now for any Borel function $f$ introduce two nonnegative Borel functions $f^{\pm} := \max\{\pm f, 0\}$ and define $\int_{\mathcal{M}} f \, d\mu := \int_{\mathcal{M}} f^{+} \, d\mu - \int_{\mathcal{M}} f^{-} \, d\mu$ if at least one of the integrals is finite. For any $\mathcal{X} \in \mathcal{B}(\mathcal{M})$ one additionally defines $\int_{\mathcal{X}} f \, d\mu := \int_{\mathcal{M}} \mathbf{1}_{\mathcal{X}} f \, d\mu$. A function $f$ is $\mu$-integrable iff it is Borel and $\int_{\mathcal{M}} |f| \, d\mu < +\infty$. The space of $\mu$-integrable functions is denoted by $L^1(\mathcal{M}, \mu)$. Observe that Borel bounded functions are $\mu$-integrable for any $\mu \in \mathcal{P}(\mathcal{M})$.

Footnote: The indicator function $\mathbf{1}_{\mathcal{X}}$ is sometimes called the characteristic function of $\mathcal{X}$, but this term has another unrelated meaning in probability theory, which might cause confusion.

We will often use the following classical theorem [38, Theorem 1.34].

Theorem 2. (Lebesgue's dominated convergence theorem) Let $(f_n)_{n \in \mathbb{N}}$ be a sequence of Borel functions on $\mathcal{M}$ such that $f_n \to f$ pointwise. For any $\mu \in \mathcal{P}(\mathcal{M})$, if there exists $g \in L^1(\mathcal{M}, \mu)$ such that $|f_n| \leq g$ for all $n \in \mathbb{N}$, then also $f \in L^1(\mathcal{M}, \mu)$ and

$$\lim_{n \to +\infty} \int_{\mathcal{M}} f_n \, d\mu = \int_{\mathcal{M}} f \, d\mu.$$

We also recall another classical result, which allows one to define a Radon probability measure on a LCH space $\mathcal{M}$ by means of a positive linear functional on $C_c(\mathcal{M})$ of norm 1 [38, Theorem 2.14].

Theorem 3. (Riesz–Markov–Kakutani representation theorem) Let $\mathcal{M}$ be a LCH space and let $\Lambda : C_c(\mathcal{M}) \to \mathbb{R}$ be a linear map such that

• $\Lambda(f) \geq 0$ for all nonnegative $f \in C_c(\mathcal{M})$,
• $\sup_{\|f\| = 1} |\Lambda(f)| = 1$, where $\| \cdot \|$ denotes the supremum norm.

Then, there exists a unique Radon probability measure $\mu \in \mathcal{P}(\mathcal{M})$ such that $\Lambda(f) = \int_{\mathcal{M}} f \, d\mu$ for all $f \in C_c(\mathcal{M})$.

Any Borel function $f : \mathcal{M}_1 \to \mathcal{M}_2$ between topological spaces induces the pushforward map $f_* : \mathcal{P}(\mathcal{M}_1) \to \mathcal{P}(\mathcal{M}_2)$, $\mu \mapsto f_* \mu$. The latter is called a pushforward measure and is defined through

$$\forall\, V \in \mathcal{B}(\mathcal{M}_2) \quad f_* \mu(V) := \mu\left( f^{-1}(V) \right).$$
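The pushforward definition above can be illustrated on a finitely supported measure, where every preimage is a finite set. The following is only a minimal sketch: the helper names `pushforward` and `integrate`, the dictionary representation of a measure, and the sample map and test function are all assumptions made for illustration and do not come from the paper.

```python
from collections import defaultdict
from fractions import Fraction


def pushforward(mu, f):
    """Return f_* mu for a finitely supported measure mu given as {point: mass}."""
    nu = defaultdict(Fraction)
    for point, mass in mu.items():
        # All the mass carried by the preimage f^{-1}({y}) is collected at y.
        nu[f(point)] += mass
    return dict(nu)


def integrate(g, mu):
    """Integral of g with respect to a finitely supported measure mu."""
    return sum(g(point) * mass for point, mass in mu.items())


def f(x):
    """Hypothetical Borel map M_1 -> M_2 (here: squaring on the real line)."""
    return x * x


def g(y):
    """Hypothetical test function on M_2."""
    return y + 1


# Hypothetical probability measure: four equally weighted points of the real line.
mu = {x: Fraction(1, 4) for x in (-2, -1, 1, 2)}
nu = pushforward(mu, f)

# Integrating g against f_* mu agrees with integrating g∘f against mu.
lhs = integrate(g, nu)
rhs = integrate(lambda x: g(f(x)), mu)
```

Exact rational arithmetic (`Fraction`) is used so that the total mass and the two integrals can be compared without rounding issues.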
As for the integrability, one has that $g \in L^1(\mathcal{M}_2, f_* \mu)$ iff $g \circ f \in L^1(\mathcal{M}_1, \mu)$, in which case $\int_{\mathcal{M}_2} g \, d(f_* \mu) = \int_{\mathcal{M}_1} g \circ f \, d\mu$.

Given two probability spaces $(\mathcal{M}_1, \mu_1)$, $(\mathcal{M}_2, \mu_2)$, there exists a unique measure $\mu_1 \times \mu_2 \in \mathcal{P}(\mathcal{M}_1 \times \mathcal{M}_2)$, called the product measure, such that $(\mu_1 \times \mu_2)(U_1 \times U_2) = \mu_1(U_1) \mu_2(U_2)$ for any $U_i \in \mathcal{B}(\mathcal{M}_i)$, $i = 1, 2$. Given a measure $\omega \in \mathcal{P}(\mathcal{M}_1 \times \mathcal{M}_2)$, its marginals are defined as $(\mathrm{pr}_i)_* \omega \in \mathcal{P}(\mathcal{M}_i)$, where $\mathrm{pr}_i : \mathcal{M}_1 \times \mathcal{M}_2 \to \mathcal{M}_i$ ($i = 1, 2$) are the canonical projection maps. Obviously, the marginals of the product measure $\mu_1 \times \mu_2$ are $\mu_1$ and $\mu_2$; however, there are usually many measures on $\mathcal{M}_1 \times \mathcal{M}_2$ sharing the same pair of marginals.

Given a measure $\mu \in \mathcal{P}(\mathcal{M})$, its support can be defined as the smallest closed set with full measure. Symbolically, $\operatorname{supp} \mu := \bigcap \{ F \subseteq \mathcal{M} : F \text{ closed},\ \mu(F) = 1 \}$.

2.3 Causality theory

For a detailed exposition of causality theory the reader is referred to [6, 31, 33, 35].

Recall that a spacetime is a connected time-oriented Lorentzian manifold. Causality theory introduces and studies certain binary relations between points (i.e. events) of a given spacetime $\mathcal{M}$. Namely, for any $p, q \in \mathcal{M}$, we say that $p$ causally (chronologically, horismotically) precedes $q$, which is denoted by $p \preceq q$ (resp. $p \ll q$, $p \to q$), iff there exists a piecewise smooth future-directed causal (resp. timelike, null) curve $\gamma : [0, 1] \to \mathcal{M}$ from $p$ to $q$, i.e. $\gamma(0) = p$ and $\gamma(1) = q$.

Clearly the relations $\preceq$ and $\ll$ are transitive and $\preceq$ is also reflexive. Moreover ([33, Chapter 14, Corollary 1]),

$$\forall\, p, q, r \in \mathcal{M} \quad p \ll r \preceq q \ \vee\ p \preceq r \ll q \ \Rightarrow\ p \ll q. \qquad (5)$$

To denote $\preceq$ ($\ll$, $\to$) understood as a subset of $\mathcal{M}^2$ it is customary to use the symbol $J^+$ (resp. $I^+$, $E^+$). $I^+$ is open and equal to $\operatorname{int} J^+$, and so the causal structure of $\mathcal{M}$ is completely determined by the relation $\preceq$ and the topology of $\mathcal{M}$. Moreover, $\overline{I^+} = \overline{J^+}$, $\partial I^+ = \partial J^+$ and $E^+ = J^+ \setminus I^+$.

For any $\mathcal{X} \subseteq \mathcal{M}$ one defines

$$J^+(\mathcal{X}) := \mathrm{pr}_2\left( (\mathcal{X} \times \mathcal{M}) \cap J^+ \right) \quad \text{and} \quad J^-(\mathcal{X}) := \mathrm{pr}_1\left( (\mathcal{M} \times \mathcal{X}) \cap J^+ \right). \qquad (6)$$

If $\mathcal{X}$ is a singleton, one simply writes $J^{\pm}(p)$ instead of $J^{\pm}(\{p\})$. Notice that $J^{\pm}(\mathcal{X}) = \bigcup_{p \in \mathcal{X}} J^{\pm}(p)$.

Let now $U \subseteq \mathcal{M}$ be an open subset of $\mathcal{M}$. One defines $\preceq_U$ to be the causal precedence relation on $U$ treated as a spacetime in its own right. By analogy with $J^+$, we denote $J^+_U := \{ (p, q) \in U^2 \mid p \preceq_U q \}$.
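The relations $\preceq$ and $\ll$ and the sets $J^+(\mathcal{X})$ introduced above can be made concrete in the simplest case. The sketch below assumes 2-dimensional Minkowski spacetime with events written as pairs $(t, x)$ and units in which the speed of light is 1, where $p \preceq q$ holds exactly when $q$ lies in the closed future light cone of $p$. The function names and the finite integer grid of sample events are hypothetical choices for illustration; they are not part of the paper, and a finite sample can only spot-check the stated properties, not prove them.

```python
from itertools import product


def causally_precedes(p, q):
    """p ⪯ q: q lies in the closed future light cone of p (events are (t, x))."""
    return q[0] - p[0] >= abs(q[1] - p[1])


def chronologically_precedes(p, q):
    """p ≪ q: q lies in the open future light cone of p."""
    return q[0] - p[0] > abs(q[1] - p[1])


def causal_future(X, events):
    """J^+(X) intersected with a finite sample of events."""
    return {q for q in events if any(causally_precedes(p, q) for p in X)}


# Finite sample of events on an integer grid (an illustrative choice).
events = list(product(range(-3, 4), repeat=2))

# Reflexivity and transitivity of ⪯ on the sample.
reflexive = all(causally_precedes(p, p) for p in events)
transitive = all(
    causally_precedes(p, q)
    for p, r, q in product(events, repeat=3)
    if causally_precedes(p, r) and causally_precedes(r, q)
)

# Property (5): p ≪ r ⪯ q or p ⪯ r ≪ q implies p ≪ q.
property_5 = all(
    chronologically_precedes(p, q)
    for p, r, q in product(events, repeat=3)
    if (chronologically_precedes(p, r) and causally_precedes(r, q))
    or (causally_precedes(p, r) and chronologically_precedes(r, q))
)

# J^+ of a set is the union of the causal futures of its points.
a, b = (0, 0), (0, 2)
union_rule = causal_future({a, b}, events) == (
    causal_future({a}, events) | causal_future({b}, events)
)
```

In this flat setting the horismos relation corresponds to equality in the light-cone inequality, in line with $E^+ = J^+ \setminus I^+$.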
Notice that $J^+_U \subseteq J^+ \cap U^2$, but not necessarily vice versa, because $p \preceq_U q$ requires a piecewise smooth future-directed causal curve from $p$ to $q$ not only to exist, but also to be contained in $U$.

Analogously to (6), one defines $J^{\pm}_U(\mathcal{X})$ for any subset $\mathcal{X} \subseteq \mathcal{M}$.

One similarly introduces $I^{\pm}(\mathcal{X})$, $I^+_U$, $I^{\pm}_U(\mathcal{X})$ as well as $E^{\pm}(\mathcal{X})$, $E^+_U$, $E^{\pm}_U(\mathcal{X})$. Observe that, by (5), $J^+(\mathcal{X}) = I^+(\mathcal{X})$ for any open $\mathcal{X} \subseteq \mathcal{M}$.

A subset $\mathcal{F} \subseteq \mathcal{M}$ is called a future set iff $J^+(\mathcal{F}) = \mathcal{F}$. Similarly, a subset $\mathcal{P} \subseteq \mathcal{M}$ is called a past set iff $J^-(\mathcal{P}) = \mathcal{P}$. Usually it is required that future and past sets be open by definition. However, if we drop this assumption, future and past sets behave more naturally under set-theoretical operations.

Proposition 1.

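$\mathcal{F} \subseteq \mathcal{M}$ is a future set iff $\mathcal{F}^c$ is a past set.

Proof: The statement is proven by the following chain of equivalences:

$$\begin{aligned}
J^-(\mathcal{F}^c) \subseteq \mathcal{F}^c
&\Leftrightarrow \forall\, s \in \mathcal{M} \quad [\exists\, r \in \mathcal{F}^c \ \ s \preceq r] \Rightarrow s \in \mathcal{F}^c \\
&\Leftrightarrow \forall\, s \in \mathcal{M} \quad \left[ \exists\, r \in J^+(s) \setminus \mathcal{F} \right] \Rightarrow s \notin \mathcal{F} \\
&\Leftrightarrow \forall\, s \in \mathcal{M} \quad J^+(s) \setminus \mathcal{F} = \emptyset \Leftarrow s \in \mathcal{F} \\
&\Leftrightarrow \forall\, s \in \mathcal{F} \quad J^+(s) \subseteq \mathcal{F} \\
&\Leftrightarrow \bigcup_{s \in \mathcal{F}} J^+(s) \subseteq \mathcal{F} \\
&\Leftrightarrow J^+(\mathcal{F}) \subseteq \mathcal{F}.
\end{aligned}$$

Footnote: Notice that only the inclusion '$\subseteq$' is nontrivial in the definition of a future (past) set.

Proposition 2. Let $\{\mathcal{X}_\alpha\}_{\alpha \in A}$ be a family of future (past) subsets of $\mathcal{M}$. Then also $\bigcup_{\alpha \in A} \mathcal{X}_\alpha$ and $\bigcap_{\alpha \in A} \mathcal{X}_\alpha$ are future (past) subsets of $\mathcal{M}$.

Proof: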

Assuming that all $\mathcal{X}_\alpha$'s are future sets, notice that $J^+\left( \bigcup_{\alpha \in A} \mathcal{X}_\alpha \right) = \bigcup_{\alpha \in A} J^+(\mathcal{X}_\alpha) = \bigcup_{\alpha \in A} \mathcal{X}_\alpha$. If the $\mathcal{X}_\alpha$'s are past sets, simply replace $J^+$ with $J^-$ in the previous sentence. We have thus shown that a union of a family of future (past) sets is a future (past) set. To obtain an analogous result for the intersection, one simply uses Proposition 1 and de Morgan's laws.

A function $f : \mathcal{M} \to \mathbb{R}$ is called

• a causal function iff it is non-decreasing along every future-directed causal curve;
• a generalised time function iff it is increasing along every future-directed causal curve;
• a time function iff it is a continuous generalised time function;
• a temporal function iff it is a smooth function with past-directed timelike gradient.

Each of the above properties is stronger than the preceding one.

Causal functions can be characterised by means of future sets.

Proposition 3.

Let $\mathcal{M}$ be a spacetime. For any function $f : \mathcal{M} \to \mathbb{R}$ the following conditions are equivalent:

i) $f$ is causal,
ii) $f^{-1}((a, +\infty))$ is a future set for any $a \in \mathbb{R}$,
iii) $f^{-1}([a, +\infty))$ is a future set for any $a \in \mathbb{R}$.

Proof: 'i) ⇒ ii)' Assume that $f$ is causal and $a \in \mathbb{R}$. If $f^{-1}((a, +\infty)) = \emptyset$, then it is trivially a future set. Suppose then that $f^{-1}((a, +\infty)) \neq \emptyset$ and take any $q \in J^+(f^{-1}((a, +\infty)))$, which means that there exists $p \preceq q$ such that $f(p) > a$. By causality of $f$ we have that $f(q) \geq f(p) > a$ and so $q \in f^{-1}((a, +\infty))$. We thus obtain the inclusion $J^+(f^{-1}((a, +\infty))) \subseteq f^{-1}((a, +\infty))$. The other inclusion is obvious.

'ii) ⇒ iii)' Observe that $f^{-1}([a, +\infty)) = \bigcap_{n=1}^{\infty} f^{-1}((a - \tfrac{1}{n}, +\infty))$. By ii) and Proposition 2, we obtain iii).

'iii) ⇒ i)' Assume $f$ is not causal, i.e. there exist $p, q \in \mathcal{M}$ such that $p \preceq q$ but $f(p) > f(q)$. We claim that $f^{-1}([f(p), +\infty))$ is not a future set. Indeed, were it a future set, then, since it clearly contains $p$, it would contain $q$ as well. But this would mean that $f(q) \geq f(p)$, in contradiction with the assumption.

On the other hand, future sets can be characterised by means of their indicator function being causal.

Corollary 1.

Let $\mathcal{M}$ be a spacetime. $\mathcal{F} \subseteq \mathcal{M}$ is a future set iff the function $\mathbf{1}_{\mathcal{F}}$ is causal.

Proof: Observe that

$$\mathbf{1}_{\mathcal{F}}^{-1}([a, +\infty)) = \begin{cases} \mathcal{M} & \text{for } a \leq 0, \\ \mathcal{F} & \text{for } 0 < a \leq 1, \\ \emptyset & \text{for } a > 1. \end{cases}$$

By equivalence 'i) ⇔ iii)' from Proposition 3, we immediately obtain the desired equivalence.

An admissible measure on $\mathcal{M}$ is any $\eta \in \mathcal{P}(\mathcal{M})$ such that ([6, Definition 3.19])

• for any nonempty open subset $U \subseteq \mathcal{M}$, $\eta(U) > 0$;
• for any $p \in \mathcal{M}$ the boundaries $\partial I^{\pm}(p)$ are $\eta$-null.

To such $\eta$ one associates the functions $t^-, t^+ : \mathcal{M} \to \mathbb{R}$, called past and future volume functions, respectively, defined via

$$\forall\, p \in \mathcal{M} \quad t^{\pm}(p) := \mp \eta(I^{\pm}(p)).$$

Volume functions are causal and semi-continuous and hence Borel.

For any $p, q \in \mathcal{M}$ let $\hat{C}(p, q)$ denote the set of piecewise smooth future-directed causal curves from $p$ to $q$. The Lorentzian distance (or time separation) is the map $d : \mathcal{M}^2 \to [0, +\infty]$ defined by

$$d(p, q) := \begin{cases} \sup\limits_{\gamma \in \hat{C}(p, q)} \int_0^1 \sqrt{-g_{\alpha\beta} \dot{\gamma}^\alpha \dot{\gamma}^\beta} \, dt & \text{if } \hat{C}(p, q) \neq \emptyset, \\ 0 & \text{if } \hat{C}(p, q) = \emptyset. \end{cases}$$

Its basic properties include:

i) For any $p, q \in \mathcal{M}$, $d(p, q) > 0 \Leftrightarrow p \ll q$.
ii) The reverse triangle inequality holds. Namely, for any $p, q, r \in \mathcal{M}$

$$p \preceq r \preceq q \ \Rightarrow\ d(p, r) + d(r, q) \leq d(p, q). \qquad (7)$$

iii) If there exists a timelike loop through $p \in \mathcal{M}$ (i.e. a piecewise smooth timelike curve from $p$ to $p$), then $d(p, p) = +\infty$. Otherwise $d(p, p) = 0$.
iv) For any $p, q \in \mathcal{M}$, if $d(p, q) \in (0, +\infty)$ then $d(q, p) = 0$.
v) The map $d$ is lower semi-continuous [33, Chapter 14, Lemma 17] and hence Borel.

The causal ladder is a hierarchy of spacetimes according to strictly increasing requirements on their causal properties [6]. The rungs of this ladder, from the top to the bottom, read:

Globally hyperbolic ⇒ Causally simple ⇒ Causally continuous ⇒ Stably causal ⇒ Strongly causal ⇒ Distinguishing ⇒ Causal ⇒ Chronological

Each level of the hierarchy can be defined in many equivalent ways. Below we present only those definitions, characterisations and properties of which we make use in the paper.
For the complete review of the causal hierarchy, consult [31, Section 3].

$\mathcal{M}$ is chronological iff it satisfies one of the following equivalent conditions:

i) $p \not\ll p$ for all $p \in \mathcal{M}$.
ii) No timelike loop exists.
iii) Any volume function is increasing along every future-directed timelike curve.
iv) $d(p, p) = 0$ for all $p \in \mathcal{M}$.

$\mathcal{M}$ is causal iff it satisfies one of the following equivalent conditions:

i) The relation $\preceq$ is a partial order, meaning that in addition to being reflexive and transitive, it is also antisymmetric.
ii) No causal loop exists.

$\mathcal{M}$ is future (past) distinguishing iff it satisfies one of the following equivalent conditions:

i) For any $p, q \in \mathcal{M}$, the equality $I^+(p) = I^+(q)$ (resp. $I^-(p) = I^-(q)$) implies that $p = q$.
ii) Any future (past) volume function is a generalised time function [6, Proposition 3.24].

$\mathcal{M}$ is distinguishing iff it is both future and past distinguishing.

$\mathcal{M}$ is strongly causal iff the family $\{ I^+(p) \cap I^-(q) \mid p, q \in \mathcal{M} \}$ is a base of the standard manifold topology of $\mathcal{M}$. It is stably causal iff it admits a time function or, equivalently, iff it admits a temporal function [7]. It is causally continuous iff any volume function is a time function.

$\mathcal{M}$ is causally simple iff it is causal and satisfies one of the following equivalent conditions [31, Proposition 3.68]:

i) $J^+(p)$ and $J^-(p)$ are closed for every $p \in \mathcal{M}$;
ii) $J^+(K)$ and $J^-(K)$ are closed for every compact $K \subseteq \mathcal{M}$;
iii) $J^+$ is a closed subset of $\mathcal{M}^2$.

Before providing a definition of the top level of the causal hierarchy, recall that a curve $\gamma : (a, b) \to \mathcal{M}$ with $-\infty \leq a < b \leq +\infty$ is called extendible iff it has a continuous extension onto $[a, b)$ or onto $(a, b]$. Otherwise such a curve is called inextendible. Recall also that a Cauchy hypersurface is a subset $\mathcal{S} \subseteq \mathcal{M}$ which is met exactly once by any inextendible timelike curve. Any such $\mathcal{S}$ is a closed achronal (i.e. $\mathcal{S}^2 \cap I^+ = \emptyset$) topological hypersurface, met by every inextendible causal curve [33, Chapter 14, Lemma 29]. However, such an $\mathcal{S}$ need not be acausal (i.e. $\mathcal{S}^2 \cap J^+$ might be nonempty).

$\mathcal{M}$ is globally hyperbolic iff it satisfies one of the following equivalent conditions:

i) $\mathcal{M}$ is causal and the sets $J^+(p) \cap J^-(q)$ are compact for all $p, q \in \mathcal{M}$;
ii) $\mathcal{M}$ admits a smooth temporal function $\mathcal{T}$, the level sets of which are (smooth spacelike) Cauchy hypersurfaces [7].

In a globally hyperbolic spacetime the Lorentzian distance $d$ is finite-valued and continuous. Moreover, for every $(p, q) \in J^+$ there exists a causal geodesic $\gamma$ of length $d(p, q)$ [33, Chapter 14].

3 On the σ-compactness of $J^+$

The purpose of this section is to prove the following theorem.

Theorem 4.

Let $\mathcal{M}$ be a spacetime. Then $J^+ \subseteq \mathcal{M}^2$ is a $\sigma$-compact set.

Let us note here that this property is automatic in causally simple spacetimes. Indeed, let $(K_n)_{n \in \mathbb{N}}$ be an exhaustion of $\mathcal{M}$ by compact sets and notice that $J^+ = \bigcup_{m, n \in \mathbb{N}} J^+ \cap (K_n \times K_m)$. But $J^+ \subseteq \mathcal{M}^2$ is a closed subset for $\mathcal{M}$ causally simple, therefore $J^+ \cap (K_n \times K_m)$ is compact for any $m, n \in \mathbb{N}$.

In the proof of Theorem 4, however, we shall make no assumptions on the causal properties of $\mathcal{M}$.

Theorem 4 implies that $J^+$ is Borel for any spacetime. As we shall see, it also implies that $J^{\pm}(\mathcal{X})$ is Borel for any closed $\mathcal{X} \subseteq \mathcal{M}$. Moreover, the previous statements are still true if we replace $J^{\pm}$ with $E^{\pm}$.

Theorem 4 is thus settled in the overlap of causality theory, topology and measure theory. Whereas the interplay between the causal and topological properties of spacetimes is relatively well understood, the question of Borelness of causal futures, a fundamental one from the point of view of any conceivable measure-theoretical extension of causality theory, has never been addressed, to the authors' best knowledge.

We recall the notion of simple convex sets (also called simple regions) [35, Section 1]. Loosely speaking, they are small patches of the spacetime $\mathcal{M}$ with 'nice' topological, differential and causal properties, and which constitute a countable cover of the entire spacetime.

Concretely, let $\mathcal{M}$ be a spacetime. Then for any $p \in \mathcal{M}$ there exists a star-shaped neighbourhood $Q \subseteq T_p \mathcal{M}$ containing the zero vector and such that the exponential map $\exp_p$ restricted to $Q$ is a diffeomorphism. The image of this diffeomorphism, $\exp_p(Q)$, is called a normal neighbourhood of $p$. Every event has a neighbourhood $U$ which is a normal neighbourhood of any $p \in U$. Such $U$ is called convex. If $U \subseteq \mathcal{M}$ is convex, then it is open and for any $p, q \in U$ there exists precisely one geodesic from $p$ to $q$ which is contained in $U$ [33, p. 129].

From the point of view of causality theory, the following property of convex sets will be crucial: if $U \subseteq \mathcal{M}$ is convex, then $J^+_U$ is a closed subset of $U^2$ [33, Lemma 14.2].

Finally, a convex set $N$ is called simple iff it is precompact and its closure is contained in another convex set $U$.

Any spacetime $\mathcal{M}$ can be covered with a family of simple convex sets [35, Proposition 1.13]. This cover can be chosen countable, because every spacetime is a Lindelöf space.

Proof of Theorem 4:

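Fix a countable, locally finite family of simple convex sets $\{N_i\}_{i \in \mathbb{N}}$ covering $\mathcal{M}$. Let also $\{U_i\}_{i \in \mathbb{N}}$ be a family of convex sets such that $\forall\, i \in \mathbb{N} \ \ \overline{N_i} \subseteq U_i$, which exists by the very definition of simple convex sets.

We introduce a couple more definitions.

Take any $i \in \mathbb{N}$. Recall that $J^+_{U_i}$ is a closed subset of $U_i^2$, whereas $\overline{N_i}$ is a compact subset of $U_i$. Let us first define the following compact subset of $U_i^2$:

$$J^+_{(N_i)} := J^+_{U_i} \cap \overline{N_i}^{\,2} = \left\{ (p, q) \in \overline{N_i}^{\,2} \mid p \preceq_{U_i} q \right\},$$

that is the set containing all those pairs of points from $\overline{N_i}$ which can be connected by a piecewise smooth future-directed causal curve contained in $U_i$. For any $\mathcal{X} \subseteq \mathcal{M}$ define, by analogy with (6),

$$J^+_{(N_i)}(\mathcal{X}) := \mathrm{pr}_2\left( (\mathcal{X} \times \mathcal{M}) \cap J^+_{(N_i)} \right) \quad \text{and} \quad J^-_{(N_i)}(\mathcal{X}) := \mathrm{pr}_1\left( (\mathcal{M} \times \mathcal{X}) \cap J^+_{(N_i)} \right). \qquad (8)$$

If $\mathcal{X}$ is a singleton, one writes simply $J^{\pm}_{(N_i)}(p)$ instead of $J^{\pm}_{(N_i)}(\{p\})$. Notice that if $\mathcal{X}$ is closed, then $J^{\pm}_{(N_i)}(\mathcal{X})$ is a compact subset of $U_i$.

For the next definition, fix $i_1, i_2 \in \mathbb{N}$ and introduce

$$J^+_{(N_{i_1}, N_{i_2})} := \left\{ (p, q) \in \overline{N_{i_1}} \times \overline{N_{i_2}} \mid \exists\, r \in \overline{N_{i_1}} \cap \overline{N_{i_2}} \ \ p \preceq_{U_{i_1}} r \preceq_{U_{i_2}} q \right\} = \bigcup_{r \in \overline{N_{i_1}} \cap \overline{N_{i_2}}} J^-_{(N_{i_1})}(r) \times J^+_{(N_{i_2})}(r).$$

This is the set of all those pairs of points $(p, q) \in \overline{N_{i_1}} \times \overline{N_{i_2}}$ which can be connected by a concatenation of two piecewise smooth future-directed causal curves, the first of which is contained in $U_{i_1}$ and the other in $U_{i_2}$, and the concatenation point $r$ must lie in the compact set $\overline{N_{i_1}} \cap \overline{N_{i_2}}$. As above, we additionally define, for any $\mathcal{X} \subseteq \mathcal{M}$,

$$J^+_{(N_{i_1}, N_{i_2})}(\mathcal{X}) := \mathrm{pr}_2\left( (\mathcal{X} \times \mathcal{M}) \cap J^+_{(N_{i_1}, N_{i_2})} \right) \quad \text{and} \quad J^-_{(N_{i_1}, N_{i_2})}(\mathcal{X}) := \mathrm{pr}_1\left( (\mathcal{M} \times \mathcal{X}) \cap J^+_{(N_{i_1}, N_{i_2})} \right). \qquad (9)$$

Finally, fix $n \geq 3$, $i_1, i_2, \ldots, i_n \in \mathbb{N}$ and define, recursively,

$$J^+_{(N_{i_1}, N_{i_2}, \ldots, N_{i_n})} := \left\{ (p, q) \in \overline{N_{i_1}} \times \overline{N_{i_n}} \mid \exists\, r \in \overline{N_{i_{n-1}}} \ \ (p, r) \in J^+_{(N_{i_1}, N_{i_2}, \ldots, N_{i_{n-1}})} \wedge (r, q) \in J^+_{(N_{i_{n-1}}, N_{i_n})} \right\} = \bigcup_{r \in \overline{N_{i_{n-1}}}} J^-_{(N_{i_1}, N_{i_2}, \ldots, N_{i_{n-1}})}(r) \times J^+_{(N_{i_{n-1}}, N_{i_n})}(r),$$

where, for any $\mathcal{X} \subseteq \mathcal{M}$,

$$J^+_{(N_{i_1}, N_{i_2}, \ldots, N_{i_n})}(\mathcal{X}) := \mathrm{pr}_2\left( (\mathcal{X} \times \mathcal{M}) \cap J^+_{(N_{i_1}, N_{i_2}, \ldots, N_{i_n})} \right) \quad \text{and} \quad J^-_{(N_{i_1}, N_{i_2}, \ldots, N_{i_n})}(\mathcal{X}) := \mathrm{pr}_1\left( (\mathcal{M} \times \mathcal{X}) \cap J^+_{(N_{i_1}, N_{i_2}, \ldots, N_{i_n})} \right). \qquad (10)$$

It is crucial to understand what these sets contain (see Figure 1). Namely, $J^+_{(N_{i_1}, N_{i_2}, \ldots, N_{i_n})}$ is the set of all those pairs of points $(p, q) \in \overline{N_{i_1}} \times \overline{N_{i_n}}$ which can be connected by a concatenation of $n - 1$ curves of the kind described above for $J^+_{(N_{i_1}, N_{i_2})}$. The curves' concatenation points must lie in $\overline{N_{i_2}}, \overline{N_{i_3}}, \ldots, \overline{N_{i_{n-1}}}$ (in that order).

Figure 1: Here $(p, q) \in J^+_{(N_{i_1}, N_{i_2}, N_{i_3}, N_{i_4})}$. The piecewise smooth curve from $p$ to $q$ shown is assumed causal and future-directed.

We now claim and shall prove inductively that

$$\forall\, n \geq 2 \ \ \forall\, i_1, i_2, \ldots, i_n \in \mathbb{N} \quad J^+_{(N_{i_1}, N_{i_2}, \ldots, N_{i_n})} \text{ is a compact subset of } U_{i_1} \times U_{i_n}, \text{ and hence of } \mathcal{M}^2.$$

Let us first prove the base case $n = 2$. Let $\{a_m\}_{m \in \mathbb{N}}$ be a dense subset of $N_{i_1} \cap N_{i_2}$, which exists by separability of $N_{i_1} \cap N_{i_2}$. Of course, $\{a_m\}_{m \in \mathbb{N}}$ is also a dense subset of $\overline{N_{i_1} \cap N_{i_2}} = \overline{N_{i_1}} \cap \overline{N_{i_2}}$. Therefore, the family of open balls $\{B(a_m, \tfrac{1}{k})\}_{m \in \mathbb{N}}$ is an open cover of $\overline{N_{i_1}} \cap \overline{N_{i_2}}$ for any fixed $k \in \mathbb{N}$.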
Because N i ∩ N i is compact, there exists a subcover { B ( a m , k ) } m ∈ F k , where F k ⊆ N is a ﬁnite set of indices.We now claim that J +( N i ,N i ) = \ k ∈ N [ m ∈ F k J − ( N i ) (cid:0) B ( a m , k ) (cid:1) × J +( N i ) (cid:0) B ( a m , k ) (cid:1) (11)= (cid:8) ( p, q ) ∈ N i × N i | ∀ k ∈ N ∃ m ∈ F k ∃ p m , q m ∈ B ( a m , k ) p (cid:22) U i p m ∧ q m (cid:22) U i q (cid:9) , which would already mean that J +( N i ,N i ) is a closed subset of N i × N i (and hence alsoa compact subset of U i × U i ), because ﬁnite unions of closed sets are closed and so areany intersections of closed sets.Indeed, to prove the inclusion ‘ ⊆ ’, assume ( p, q ) ∈ N i × N i is such that there exists r ∈ N i ∩ N i satisfying p (cid:22) U i r (cid:22) U i q . For any k ∈ N , since { B ( a m , k ) } m ∈ F k covers N i ∩ N i , it is possible to ﬁnd m ∈ F k such that r ∈ B ( a m , k ). One can thus simply take p m := r =: q m .On the other hand, to show the inclusion ‘ ⊇ ’, let us assume that ( p, q ) ∈ N i × N i issuch that ∀ k ∈ N ∃ m ∈ F k ∃ p m , q m ∈ B ( a m , k ) p (cid:22) U i p m and q m (cid:22) U i q. We can thus construct the sequence { a m k } k ∈ N , which, being contained in the compact set N i ∩ N i , has a subsequence { a m kl } l ∈ N convergent to some a ∞ ∈ N i ∩ N i . Notice nowthat because p m k , q m k ∈ B ( a m k , k ) for any k ∈ N , therefore alsolim l → + ∞ p m kl = lim l → + ∞ q m kl = a ∞ . We now invoke the fact that J + U i and J + U i are closed subsets of U i and of U i , respectively.It implies that p (cid:22) U i a ∞ (cid:22) U i q, which completes the proof of (11) and of the base case of the induction.We now move to the proof of the inductive step, which essentially goes along the samelines as the proof of the base case. 13he assumption is that J +( N i ,N i ,...,N in ) is a compact subset of U i × U i n for any i , . . . , i n ∈ N . 
The induction hypothesis states that J +( N i ,N i ,...,N in +1 ) is a compact subset of U i × U i n +1 for any i , . . . , i n +1 ∈ N .Let { a m } m ∈ N denote now a dense subset of N i n , and hence also a dense subset of N i n .Similarly as before, for each k ∈ N consider the family { B ( a m , k ) } m ∈ N covering N i n , andtake its ﬁnite subcover { B ( a m , k ) } m ∈ F k .We now claim that J +( N i ,N i ,...,N in +1 ) = \ k ∈ N [ m ∈ F k J − ( N i ,N i ,...,N in ) (cid:0) B ( a m , k ) (cid:1) × J +( N in ,N in +1 ) (cid:0) B ( a m , k ) (cid:1) (12)= n ( p, q ) ∈ N i × N i n +1 | ∀ k ∈ N ∃ m ∈ F k ∃ p m , q m ∈ B ( a m , k )( p, p m ) ∈ J +( N i ,N i ,...,N in ) ∧ ( q m , q ) ∈ J +( N in ,N in +1 ) o , which would already mean that J +( N i ,N i ,...,N in +1 ) is a closed subset of N i × N i n +1 (and hencealso a compact subset of U i × U i n +1 ), because we already know that J − ( N i ,N i ,...,N in ) (cid:0) B ( a m , k ) (cid:1) is closed in N i (by the induction assumption and deﬁnitions (10)) and that J +( N in ,N in +1 ) (cid:0) B ( a m , k ) (cid:1) is closed in N i n +1 (by the base case and deﬁnitions (9)).To show the inclusion ‘ ⊆ ’ in (12), assume ( p, q ) ∈ N i × N i n +1 is such that there exists r ∈ N i n satisfying ( p, r ) ∈ J +( N i ,N i ,...,N in ) and ( r, q ) ∈ J +( N in ,N in +1 ) . For any k ∈ N , since { B ( a m , k ) } m ∈ F k covers N i n , it is possible to ﬁnd m ∈ F k such that r ∈ B ( a m , k ). One canthus simply take p m := r =: q m .On the other hand, to show the inclusion ‘ ⊇ ’, let us assume that ( p, q ) ∈ N i × N i n +1 are such that ∀ k ∈ N ∃ m ∈ F k ∃ p m , q m ∈ B ( a m , k ) ( p, p m ) ∈ J +( N i ,N i ,...,N in ) , ( q m , q ) ∈ J +( N in ,N in +1 ) . We can thus construct the sequence ( a m k ) k ∈ N , which, being contained in the compact set N i n , has a subsequence ( a m kl ) l ∈ N convergent to some a ∞ ∈ N i n . 
Analogously as before, weargue that also the sequences ( p m k ) , ( q m k ) have subsequences converging to a ∞ .We now invoke the induction assumption and deﬁnitions (10), which together implythat J +( N i ,N i ,...,N in ) is a compact (and hence closed) subset of U i × U i n and therefore( p, a ∞ ) ∈ J +( N i ,N i ,...,N in ) .On the other hand, invoking the base case and deﬁnitions (9), we also have that J +( N in ,N in +1 ) is a compact (and hence closed) subset of U i n × U i n +1 and so ( a ∞ , q ) ∈ J +( N in ,N in +1 ) .This completes the proof of (12) and of the entire induction.Altogether, we can thus write that ∀ n ∈ N ∀ i , i , . . . , i n ∈ N (13) J +( N i ,N i ,...,N in ) is a compact subset of U i × U i n , and hence of M . Bearing the above in mind, the σ -compactness of J + will be proven if we show that J + = ∞ [ n =1 [ i ,i ,...,i n ∈ N J +( N i ,N i ,...,N in ) . (14)14n order to show the inclusion ‘ ⊆ ’, take any ( p, q ) ∈ J + and let γ : [0 , → M bea piecewise smooth future-directed causal curve from p to q .Consider the inverse images γ − ( N i ), i ∈ N . By continuity of γ , they are all opensubsets of [0 , disconnected (i.e. they need not be intervals).Nevertheless, every γ − ( N i ) is a union of its connected components, which are all open subintervals of [0 , γ − ( N i )’s, i ∈ N .This family is a cover of [0 ,

1] and, since the latter is a compact space, we can take its ﬁnitesubcover I := { I , I , . . . , I n } , where each of the intervals I j , ( j = 1 , . . . , n ) is a connectedcomponent of some (possibly not unique) γ − ( N i j ). Therefore ∀ j = 1 , . . . , n γ ( I j ) ⊆ N i j and, by the continuity of γ , ∀ j = 1 , . . . , n γ ( I j ) ⊆ N i j . Without loss of generality, we can assume that I j I j for all j = j . Bearing this inmind, we can rewrite I either as { [0 , } (the trivial cover) or, if n >

1, as I = { [0 , b ) , ( a , b ) , . . . , ( a n − , b n − ) , ( a n , } , where 0 < a < a < . . . < a n <

1. Notice also that b j > a j +1 for j = 1 , . . . , n −

1, becauseotherwise such I would not be a cover.In the ﬁrst (trivial) case, γ ([0 , ⊆ N i ⊆ N i for some i ∈ N and hence ( p, q ) ∈ J +( N i ) .In the second case, observe that γ ([0 , a ]) ⊆ γ ([0 , b )) ⊆ N i ⊆ N i ,γ ([ a , a ]) ⊆ γ ([ a , b ]) ⊆ N i ,. . .γ ([ a j , a j +1 ]) ⊆ γ ([ a j , b j ]) ⊆ N i j ,. . .γ ([ a n − , a n ]) ⊆ γ ([ a n − , b n − ]) ⊆ N i n − ,γ ([ a n , ⊆ N i n , for some i , . . . , i n ∈ N and hence ( p, q ) ∈ J +( N i ,...,N in ) .In either case, we obtain that ( p, q ) ∈ ∞ S n =1 S i ,i ,...,i n ∈ N J +( N i ,N i ,...,N in ) .In order to show the other inclusion ‘ ⊇ ’ in (14), notice simply that a concatenation ofﬁnitely many piecewise smooth future-directed causal curves is itself a piecewise smoothfuture-directed causal curve. Therefore, if ( p, q ) ∈ J +( N i ,N i ,...,N in ) , then ( p, q ) ∈ J + . Corollary 2.

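Let $\mathcal{M}$ be a spacetime. Then $E^+$ is a $\sigma$-compact subset of $\mathcal{M}^2$.

Proof: On the strength of (14), we have that

$$E^+ := J^+ \setminus I^+ = \bigcup_{n=1}^{\infty} \ \bigcup_{i_1,i_2,\ldots,i_n \in \mathbb{N}} J^+_{(N_{i_1},\ldots,N_{i_n})} \setminus I^+$$

and since $I^+$ is an open subset of $\mathcal{M}^2$, the set $J^+_{(N_{i_1},\ldots,N_{i_n})} \setminus I^+$ is a closed subset of $\overline{N_{i_1}} \times \overline{N_{i_n}}$ (for any $i_1, i_2, \ldots, i_n \in \mathbb{N}$), and hence a compact subset of $\mathcal{M}^2$.

Footnote (on the connected components of the open sets $\gamma^{-1}(N_i)$ used in the preceding proof): Locally connected spaces (and $[0,1]$ is such a space) can be characterised as the spaces in which every connected component of every open set is itself open.

Corollary 3.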

Let $\mathcal{M}$ be a spacetime and let $X \subseteq \mathcal{M}$ be a countable union of closed sets. Then $J^\pm(X)$ and $E^\pm(X)$ are $\sigma$-compact subsets of $\mathcal{M}$.

Proof: By assumption, $X = \bigcup_{m=1}^{\infty} X_m$, where for any $m \in \mathbb{N}$, $X_m \subseteq \mathcal{M}$ is closed. Observe that, by (14),

$$J^+(X) := \mathrm{pr}_2\Big( \Big( \bigcup_{m=1}^{\infty} X_m \times \mathcal{M} \Big) \cap \bigcup_{n=1}^{\infty} \ \bigcup_{i_1,\ldots,i_n \in \mathbb{N}} J^+_{(N_{i_1},\ldots,N_{i_n})} \Big)$$
$$= \mathrm{pr}_2\Big( \bigcup_{m=1}^{\infty} \bigcup_{n=1}^{\infty} \ \bigcup_{i_1,\ldots,i_n \in \mathbb{N}} (X_m \times \mathcal{M}) \cap J^+_{(N_{i_1},\ldots,N_{i_n})} \Big)$$
$$= \bigcup_{m=1}^{\infty} \bigcup_{n=1}^{\infty} \ \bigcup_{i_1,\ldots,i_n \in \mathbb{N}} \mathrm{pr}_2\big( (X_m \times \mathcal{M}) \cap J^+_{(N_{i_1},\ldots,N_{i_n})} \big).$$

For any $m, n \in \mathbb{N}$ and any $i_1, i_2, \ldots, i_n \in \mathbb{N}$ the set $(X_m \times \mathcal{M}) \cap J^+_{(N_{i_1},\ldots,N_{i_n})}$ is closed in $\overline{N_{i_1}} \times \overline{N_{i_n}}$ and hence compact in $\mathcal{M}^2$. Since $\mathrm{pr}_2$ is a continuous map, the projection of a compact set is itself compact and we obtain that $J^+(X)$ is $\sigma$-compact.

The proof for $J^-(X)$ is completely analogous. Moreover, by the previous corollary, replacing $J^\pm$ with $E^\pm$ in the above proof yields the desired result for the horismotical futures and pasts.

The final corollary shows that the volume functions can be defined by means of causal futures instead of the chronological ones.

Corollary 4.

Let $\mathcal{M}$ be a spacetime and $\eta \in \mathscr{P}(\mathcal{M})$ be an admissible measure. Then the volume functions $t^\pm$ associated to $\eta$ satisfy $t^\pm(p) = \mp\, \eta(J^\pm(p))$ for all $p \in \mathcal{M}$. Moreover, $\eta(E^\pm(p)) = 0$ for all $p \in \mathcal{M}$.

Proof: By the previous corollary, $E^\pm(p)$ and $J^\pm(p)$ are Borel sets for any $p \in \mathcal{M}$ and so the expressions $\eta(E^\pm(p))$ and $\eta(J^\pm(p))$ are well defined. Since it is true that

$$\forall\, p \in \mathcal{M} \quad I^-(p) \subseteq J^-(p) \subseteq \overline{J^-(p)} = \overline{I^-(p)} = I^-(p) \cup \partial I^-(p),$$

with $I^-(p) \cap \partial I^-(p) = \emptyset$, we have

$$t^-(p) = \eta(I^-(p)) \leq \eta(J^-(p)) \leq \eta\big(\overline{J^-(p)}\big) = \eta\big(\overline{I^-(p)}\big) = \eta(I^-(p)) + \underbrace{\eta(\partial I^-(p))}_{=\,0} = t^-(p),$$

where we have used the second condition in the definition of an admissible measure. Therefore, $t^-(p) = \eta(J^-(p))$. The proof for $t^+$ is analogous.

Moreover, since $I^\pm(p) \subseteq J^\pm(p)$ for any $p \in \mathcal{M}$, we obtain $\eta(E^\pm(p)) = \eta(J^\pm(p) \setminus I^\pm(p)) = \eta(J^\pm(p)) - \eta(I^\pm(p)) = 0$.

4 Causality for probability measures

The aim of this section is to extend the causal precedence relation $\preceq$ onto the space of measures $\mathscr{P}(\mathcal{M})$ on a given spacetime $\mathcal{M}$. We begin by invoking a certain characterisation of causality between events.

Let $\mathcal{C}(\mathcal{M})$ denote the set of smooth bounded causal functions on the spacetime $\mathcal{M}$.

Theorem 5. Let $\mathcal{M}$ be a globally hyperbolic spacetime. For any $p, q \in \mathcal{M}$ the following conditions are equivalent:

1⋄ $\forall\, f \in \mathcal{C}(\mathcal{M}) \quad f(p) \leq f(q)$,

2⋄ $p \preceq q$.

The proof, based on a result by Besnard [9], can be found in [17, Proposition 10] (see also [30]). Actually, as we shall see later, the above characterisation is valid also in causally simple spacetimes (cf. Corollary 6).

As an important side note, observe that Theorem 5 exactly mirrors the definition of a causal function. Indeed, the latter can be written symbolically as

$$f \text{ is a causal function} \quad \text{iff} \quad \forall\, (p,q) \in J^+ \ \ f(p) \leq f(q),$$

whereas Theorem 5 in fact says that

$$(p,q) \in J^+ \quad \text{iff} \quad \forall\, f \text{ a causal function} \ \ f(p) \leq f(q).$$

Therefore, instead of using $\preceq$ to define what a causal function is, one can come up with an abstract, suitably structured set $\mathcal{C}$ of 'smooth bounded causal functions' and define $\preceq$ through $\mathcal{C}$ using the analogue of Theorem 5. This was done by Eckstein and Franco in [17] in the very general context of noncommutative geometry.

Condition 1⋄ provides a 'dual' definition of the causal precedence, which actually suggests how $\preceq$ could be extended onto $\mathscr{P}(\mathcal{M})$.

Definition 1. Let $\mathcal{M}$ be a globally hyperbolic spacetime. For any $\mu, \nu \in \mathscr{P}(\mathcal{M})$ we say that $\mu$ causally precedes $\nu$ (symbolically $\mu \preceq \nu$) iff

$$\forall\, f \in \mathcal{C}(\mathcal{M}) \quad \int_{\mathcal{M}} f \, d\mu \leq \int_{\mathcal{M}} f \, d\nu.$$

In [17] it is proven (in a much more general context) that the above defined relation is in fact a partial order. This definition, however, has two shortcomings. Firstly, it is well motivated only on globally hyperbolic spacetimes. Secondly, the intuitive notion of causality for spread objects, as phrased in the introduction, is not directly visible in Definition 1.
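To make Definition 1 concrete, the following minimal sketch tests the integral inequality for finitely supported measures on 1+1 Minkowski spacetime (with c = 1). Everything named here is an illustrative assumption rather than part of the paper: the helper names, the sample measures, and above all the finite family of test functions arctan(t + a x) with |a| ≤ 1, which are smooth, bounded and causal. Since $\mathcal{C}(\mathcal{M})$ is infinite, passing this test is only a necessary condition for $\mu \preceq \nu$.

```python
import math

def causal_function(a):
    """Return f(t, x) = arctan(t + a*x). For |a| <= 1 it is non-decreasing
    along future-directed causal curves of 1+1 Minkowski spacetime (c = 1)."""
    return lambda event: math.atan(event[0] + a * event[1])

# Hypothetical finite test family; the full set C(M) is infinite.
TEST_FUNCTIONS = [causal_function(a) for a in (-1.0, -0.5, 0.0, 0.5, 1.0)]

def integrate(f, measure):
    """Integral of f against a finitely supported measure {event: weight}."""
    return sum(weight * f(event) for event, weight in measure.items())

def passes_necessary_test(mu, nu, tol=1e-12):
    """True if int f dmu <= int f dnu for every test function f."""
    return all(integrate(f, mu) <= integrate(f, nu) + tol
               for f in TEST_FUNCTIONS)

# Events are (t, x) pairs. A point mass at the origin ...
mu = {(0.0, 0.0): 1.0}
# ... spreading into two half-masses inside its future light cone,
nu_spread = {(1.0, -0.5): 0.5, (1.0, 0.5): 0.5}
# ... versus a point mass at a spacelike separated event.
nu_spacelike = {(0.5, 2.0): 1.0}

print(passes_necessary_test(mu, nu_spread))
print(passes_necessary_test(nu_spread, mu))
print(passes_necessary_test(mu, nu_spacelike))
```

In this sketch the spacelike case is rejected by the tilted functions (a ≠ 0), which is why the family contains more than the single function arctan(t).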

In the following, we provide various conditions which are equivalent to the above definition of a causal relation between measures. Moreover, in some of the implications the assumption of global hyperbolicity of $\mathcal{M}$ can be relaxed.

The first result states that if $\mathcal{C}(\mathcal{M})$ is sufficiently rich, one can abandon the smoothness requirement.

Theorem 6. Let $\mathcal{M}$ be a stably causal spacetime. For any $\mu, \nu \in \mathscr{P}(\mathcal{M})$ the following conditions are equivalent:

1• For all $f \in \mathcal{C}(\mathcal{M})$
$$\int_{\mathcal{M}} f \, d\mu \leq \int_{\mathcal{M}} f \, d\nu. \tag{15}$$

2• For all causal $f \in C_b(\mathcal{M})$
$$\int_{\mathcal{M}} f \, d\mu \leq \int_{\mathcal{M}} f \, d\nu. \tag{16}$$

Proof: (1• ⇒ 2•) Let $f \in C_b(\mathcal{M})$ be causal. Relying on [12, Corollary 5.4 and the subsequent comments] we use the fact that in stably causal spacetimes any time function can be uniformly approximated by a smooth time (or even temporal) function.

Using the stable causality, fix a temporal function $T: \mathcal{M} \to \mathbb{R}$. For any $\varepsilon > 0$, the function $f + \varepsilon \arctan T$ is a time function which clearly approximates $f$ uniformly. By the above mentioned corollary, this function in turn can be approximated by a smooth time function $f_\varepsilon$ such that

$$\forall\, p \in \mathcal{M} \quad | f(p) + \varepsilon \arctan T(p) - f_\varepsilon(p) | < \varepsilon. \tag{17}$$

Clearly $f_\varepsilon \in \mathcal{C}(\mathcal{M})$, therefore by 1•

$$\int_{\mathcal{M}} f_\varepsilon \, d\mu \leq \int_{\mathcal{M}} f_\varepsilon \, d\nu.$$

To obtain 2• it now remains to observe that for any measure $\eta \in \mathscr{P}(\mathcal{M})$ it is true that $\lim_{\varepsilon \to 0^+} \int_{\mathcal{M}} f_\varepsilon \, d\eta = \int_{\mathcal{M}} f \, d\eta$. Indeed, for any $\eta \in \mathscr{P}(\mathcal{M})$ and $\varepsilon > 0$

$$\Big| \int_{\mathcal{M}} f \, d\eta - \int_{\mathcal{M}} f_\varepsilon \, d\eta \Big| \leq \int_{\mathcal{M}} | f - f_\varepsilon | \, d\eta \leq \int_{\mathcal{M}} | f + \varepsilon \arctan T - f_\varepsilon | \, d\eta + \varepsilon \int_{\mathcal{M}} | \arctan T | \, d\eta \leq \varepsilon \big( 1 + \tfrac{\pi}{2} \big),$$

where we have used (17).

(2• ⇒ 1•) Trivial.

The next result characterises the relation $\preceq$ between measures in terms of open future sets.

Theorem 7.

Let $\mathcal{M}$ be a causally continuous spacetime. For any $\mu, \nu \in \mathscr{P}(\mathcal{M})$ conditions 1• and 2• are equivalent to the following condition:

3• For every open future set $\mathcal{F} \subseteq \mathcal{M}$
$$\mu(\mathcal{F}) \leq \nu(\mathcal{F}). \tag{18}$$

Proof: (2• ⇒ 3•) Fix an open future set $\mathcal{F} \subseteq \mathcal{M}$ and let $\eta$ be an admissible measure on $\mathcal{M}$. For any $\lambda \in (0,1]$ construct a new admissible measure $\eta_\lambda := \lambda \eta + (1-\lambda)\, \eta(\,\cdot \cap \mathcal{F})$ and consider the associated past volume function $t^-_\lambda$ defined via

$$\forall\, p \in \mathcal{M} \quad t^-_\lambda(p) := \eta_\lambda(I^-(p)) = \lambda\, \eta(I^-(p)) + (1-\lambda)\, \eta(I^-(p) \cap \mathcal{F}) = \eta(I^-(p) \cap \mathcal{F}) + \lambda\, \eta(I^-(p) \setminus \mathcal{F}).$$

Because $\mathcal{M}$ is causally continuous, $t^-_\lambda$ is a time function for any $\lambda \in (0,1]$.

For any $n \in \mathbb{N}$ define an increasing function $\varphi_n \in C^\infty_b(\mathbb{R})$ by

$$\forall\, x \in \mathbb{R} \quad \varphi_n(x) := \tfrac{1}{2} + \tfrac{1}{\pi} \arctan\big( n^2 x - n \big).$$

The sequence of functions $(\varphi_n)$ is pointwise convergent to the indicator function of $\mathbb{R}_{>0}$. Moreover, $\varphi_n \circ t^-_\lambda$ is a bounded time function for every $n \in \mathbb{N}$ and $\lambda \in (0,1]$. By 2•, this means that

$$\int_{\mathcal{M}} \varphi_n\big( \eta(I^-(p) \cap \mathcal{F}) + \lambda\, \eta(I^-(p) \setminus \mathcal{F}) \big) \, d\mu(p) \leq \int_{\mathcal{M}} \varphi_n\big( \eta(I^-(p) \cap \mathcal{F}) + \lambda\, \eta(I^-(p) \setminus \mathcal{F}) \big) \, d\nu(p).$$

Since the functions $\varphi_n$ are bounded and continuous, we can invoke Lebesgue's dominated convergence theorem and first take $\lambda \to 0^+$, obtaining

$$\int_{\mathcal{M}} \varphi_n\big( \eta(I^-(p) \cap \mathcal{F}) \big) \, d\mu(p) \leq \int_{\mathcal{M}} \varphi_n\big( \eta(I^-(p) \cap \mathcal{F}) \big) \, d\nu(p)$$

and then take $n \to +\infty$, which yields

$$\int_{\mathcal{M}} \mathbb{1}_{\mathbb{R}_{>0}}\big( \eta(I^-(p) \cap \mathcal{F}) \big) \, d\mu(p) \leq \int_{\mathcal{M}} \mathbb{1}_{\mathbb{R}_{>0}}\big( \eta(I^-(p) \cap \mathcal{F}) \big) \, d\nu(p).$$

It is now crucial to notice that the function $p \mapsto \eta(\mathcal{F} \cap I^-(p))$ is positive on $\mathcal{F}$ and zero on $\mathcal{M} \setminus \mathcal{F}$. These observations follow from the definition of an admissible measure and of $\mathcal{F}$, and together with the above inequality of integrals they imply that $\mu(\mathcal{F}) \leq \nu(\mathcal{F})$.

(3• ⇒ 2•) Let $f \in C_b(\mathcal{M})$ be causal and let $T$ be a temporal function on $\mathcal{M}$. For any $\varepsilon > 0$ define $f_\varepsilon := f + \varepsilon \arctan T$.

Denote $m := \inf_{p \in \mathcal{M}} f_\varepsilon(p)$ and $M := \sup_{p \in \mathcal{M}} f_\varepsilon(p)$. For any fixed $n \in \mathbb{N}$ define the sets

$$\mathcal{F}^{(n)}_k := f_\varepsilon^{-1}\big( \big( m + k \tfrac{M-m}{n}, +\infty \big) \big), \quad k = 1, 2, \ldots, n-1.$$

Because $f_\varepsilon$ is continuous and causal, all $\mathcal{F}^{(n)}_k$'s are open future sets (cf. Proposition 3).

For any fixed $n \in \mathbb{N}$ let us consider the following simple function

$$s_n := m + \sum_{k=1}^{n-1} \tfrac{M-m}{n} \, \mathbb{1}_{\mathcal{F}^{(n)}_k}.$$

Footnote (on the temporal function $T$): Such a function exists because causal continuity implies stable causality. In fact, in the proof of (3• ⇒ 2•) we only need $\mathcal{M}$ to be stably causal.
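The simple function $s_n$ depends on the point $p$ only through the value $f_\varepsilon(p)$, so it can be evaluated without any reference to the spacetime. The following is a minimal sketch in exact rational arithmetic; the function names and the sample values of $m$, $M$ and $n$ are hypothetical. It counts the level sets $\mathcal{F}^{(n)}_k$ containing a point, compares the result with the ceiling-function form $m + \frac{M-m}{n}\lceil \frac{n}{M-m}(f_\varepsilon(p) - m) - 1 \rceil$, and records the gaps $f_\varepsilon(p) - s_n(p)$.

```python
from fractions import Fraction
from math import ceil

def simple_approx(value, m, M, n):
    """s_n at a point where f_eps equals `value`, assuming m < value < M."""
    step = (M - m) / n
    # Number of level sets F_k^(n), k = 1..n-1, that contain the point,
    # i.e. the number of k with m + k*step < value.
    levels = sum(1 for k in range(1, n) if m + k * step < value)
    return m + step * levels

def closed_form(value, m, M, n):
    """The same quantity written with the ceiling function."""
    step = (M - m) / n
    return m + step * ceil((value - m) / step - 1)

# Hypothetical bounds and resolution; Fractions keep the arithmetic exact,
# so values sitting exactly on a level boundary are handled correctly.
m, M, n = Fraction(-1), Fraction(3), 8
step = (M - m) / n
values = [m + (M - m) * Fraction(j, 100) for j in range(1, 100)]
gaps = [v - simple_approx(v, m, M, n) for v in values]

print(min(gaps), max(gaps), step)
```

The sampled gaps all lie in the half-open interval $(0, \frac{M-m}{n}]$, with the upper end attained exactly when $f_\varepsilon(p)$ sits on a level boundary.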

By 3•, we obtain the following inequality of integrals

$$\int_{\mathcal{M}} s_n \, d\mu = m + \sum_{k=1}^{n-1} \tfrac{M-m}{n} \, \mu\big( \mathcal{F}^{(n)}_k \big) \leq m + \sum_{k=1}^{n-1} \tfrac{M-m}{n} \, \nu\big( \mathcal{F}^{(n)}_k \big) = \int_{\mathcal{M}} s_n \, d\nu. \tag{19}$$

It is not difficult to realise that

$$\forall\, p \in \mathcal{M} \quad \big[ \forall\, n \in \mathbb{N} \ \ s_n(p) < f_\varepsilon(p) \big] \quad \text{and} \quad \lim_{n \to +\infty} s_n(p) = f_\varepsilon(p).$$

More concretely, one can show that

$$\forall\, p \in \mathcal{M} \quad f_\varepsilon(p) - s_n(p) \in \big( 0, \tfrac{M-m}{n} \big]. \tag{20}$$

Indeed, the very definition of the $\mathcal{F}^{(n)}_k$'s implies that $\mathcal{F}^{(n)}_1 \supset \mathcal{F}^{(n)}_2 \supset \ldots \supset \mathcal{F}^{(n)}_{n-1}$, therefore if $p \in \mathcal{F}^{(n)}_k$ for some $k \in \{2, \ldots, n-1\}$, then $p \in \mathcal{F}^{(n)}_j$ for all $j \in \{1, \ldots, k\}$. This implies that

$$s_n(p) = m + \sum_{k=1}^{n-1} \tfrac{M-m}{n} \, \mathbb{1}_{\mathcal{F}^{(n)}_k}(p) = m + \tfrac{M-m}{n} \max\big\{ k \mid p \in \mathcal{F}^{(n)}_k \big\} = m + \tfrac{M-m}{n} \max\big\{ k \mid m + k \tfrac{M-m}{n} < f_\varepsilon(p) \big\}$$
$$= m + \tfrac{M-m}{n} \max\big\{ k \mid k < \tfrac{n}{M-m} ( f_\varepsilon(p) - m ) \big\} = m + \tfrac{M-m}{n} \big\lceil \tfrac{n}{M-m} ( f_\varepsilon(p) - m ) - 1 \big\rceil,$$

where $\lceil \cdot \rceil$ denotes the ceiling function (and the maximum of the empty set is understood as $0$). Using the fact that $x - \lceil x - 1 \rceil \in (0,1]$ for any $x \in \mathbb{R}$, we obtain that

$$f_\varepsilon(p) - s_n(p) = \tfrac{M-m}{n} \Big( \tfrac{n}{M-m} ( f_\varepsilon(p) - m ) - \big\lceil \tfrac{n}{M-m} ( f_\varepsilon(p) - m ) - 1 \big\rceil \Big) \in \big( 0, \tfrac{M-m}{n} \big],$$

which proves (20).

Invoking now Lebesgue's dominated convergence theorem and passing with $n \to +\infty$ in (19) we obtain

$$\int_{\mathcal{M}} f_\varepsilon \, d\mu \leq \int_{\mathcal{M}} f_\varepsilon \, d\nu.$$

Invoking Lebesgue's theorem again, we pass with $\varepsilon \to 0^+$ and obtain 2•.

The third and the most important result concerns causally simple spacetimes. We show that condition 3• extends to different kinds of future sets. Moreover, we introduce a condition that uses the existential quantifier.

Theorem 8.

Let $\mathcal{M}$ be a causally simple spacetime. For any $\mu, \nu \in \mathscr{P}(\mathcal{M})$ conditions 1•, 2• and 3• are equivalent to all the following conditions:

4• For every compact $\mathcal{K} \subseteq \mathcal{M}$
$$\mu(J^+(\mathcal{K})) \leq \nu(J^+(\mathcal{K})). \tag{21}$$

5• For every Borel future set $\mathcal{F} \subseteq \mathcal{M}$
$$\mu(\mathcal{F}) \leq \nu(\mathcal{F}). \tag{22}$$

6• For all $\varphi, \psi \in C_b(\mathcal{M})$
$$\big[ \forall\, p, q \in \mathcal{M} \ \ p \preceq q \Rightarrow \varphi(p) \leq \psi(q) \big] \ \Rightarrow \ \int_{\mathcal{M}} \varphi \, d\mu \leq \int_{\mathcal{M}} \psi \, d\nu. \tag{23}$$

7• There exists $\omega \in \mathscr{P}(\mathcal{M}^2)$ such that
i) $(\mathrm{pr}_1)_* \omega = \mu$ and $(\mathrm{pr}_2)_* \omega = \nu$;
ii) $\omega(J^+) = 1$.

Proof: (3• ⇒ 4•) Let $\mathcal{K}$ be a compact subset of $\mathcal{M}$. Fix $n \in \mathbb{N}$ and cover $\mathcal{K}$ with open balls of radius $\tfrac{1}{n}$, concretely $\mathcal{K} \subseteq \bigcup_{x \in \mathcal{K}} B\big(x, \tfrac{1}{n}\big)$.

[Figure 2: $\big\{ J^+\big( B\big(x, \tfrac{1}{n}\big) \big) \big\}_{x \in \mathcal{K}}$ covers $J^+(\mathcal{K})$.]

Hence

$$\forall\, n \in \mathbb{N} \quad J^+(\mathcal{K}) \subseteq J^+\Big( \bigcup_{x \in \mathcal{K}} B\big(x, \tfrac{1}{n}\big) \Big) = \bigcup_{x \in \mathcal{K}} J^+\big( B\big(x, \tfrac{1}{n}\big) \big). \tag{24}$$

We claim that

$$J^+(\mathcal{K}) = \bigcap_{n=1}^{\infty} \bigcup_{x \in \mathcal{K}} J^+\big( B\big(x, \tfrac{1}{n}\big) \big). \tag{25}$$

By (24), it suffices to prove the inclusion '$\supseteq$'. Suppose then that $q \in \bigcap_{n=1}^{\infty} \bigcup_{x \in \mathcal{K}} J^+\big( B\big(x, \tfrac{1}{n}\big) \big)$, which means that

$$\forall\, n \in \mathbb{N} \ \exists\, x_n \in \mathcal{K} \ \exists\, p_n \in B\big(x_n, \tfrac{1}{n}\big) \quad p_n \preceq q.$$

Since $\mathcal{K}$ is compact, the sequence $(x_n)$ has a convergent subsequence $(x_{n_k})$, $\lim_{k \to +\infty} x_{n_k} = x_\infty \in \mathcal{K}$. Notice that also the subsequence $(p_{n_k})$ converges to $x_\infty$. But because $J^+$ is a closed set in the case of a causally simple spacetime, the fact that $p_{n_k} \preceq q$ for every $k \in \mathbb{N}$ implies that $x_\infty \preceq q$ and therefore $q \in J^+(\mathcal{K})$.

By 3• we know that

$$\mu\Big( \bigcup_{x \in \mathcal{K}} J^+\big( B\big(x, \tfrac{1}{n}\big) \big) \Big) = \mu\Big( \bigcup_{x \in \mathcal{K}} I^+\big( B\big(x, \tfrac{1}{n}\big) \big) \Big) \leq \nu\Big( \bigcup_{x \in \mathcal{K}} I^+\big( B\big(x, \tfrac{1}{n}\big) \big) \Big) = \nu\Big( \bigcup_{x \in \mathcal{K}} J^+\big( B\big(x, \tfrac{1}{n}\big) \big) \Big). \tag{26}$$

Since for all $n \in \mathbb{N}$ it is true that $J^+\big( B\big(x, \tfrac{1}{n}\big) \big) \supseteq J^+\big( B\big(x, \tfrac{1}{n+1}\big) \big)$, therefore, by (1),

$$\mu\big( J^+(\mathcal{K}) \big) = \mu\Big( \bigcap_{n=1}^{\infty} \bigcup_{x \in \mathcal{K}} J^+\big( B\big(x, \tfrac{1}{n}\big) \big) \Big) = \lim_{n \to +\infty} \mu\Big( \bigcup_{x \in \mathcal{K}} J^+\big( B\big(x, \tfrac{1}{n}\big) \big) \Big)$$
$$\leq \lim_{n \to +\infty} \nu\Big( \bigcup_{x \in \mathcal{K}} J^+\big( B\big(x, \tfrac{1}{n}\big) \big) \Big) = \nu\Big( \bigcap_{n=1}^{\infty} \bigcup_{x \in \mathcal{K}} J^+\big( B\big(x, \tfrac{1}{n}\big) \big) \Big) = \nu\big( J^+(\mathcal{K}) \big),$$

where we have also used (25) and (26), thus proving 4•.

(4• ⇒ 5•) Let $\mathcal{F} \subseteq \mathcal{M}$ be any Borel future set. For any $\mathcal{K} \subseteq \mathcal{F}$ it is then true that $J^+(\mathcal{K}) \subseteq \mathcal{F}$. Therefore

$$\mu(\mathcal{K}) \leq \mu(J^+(\mathcal{K})) \leq \mu(\mathcal{F}).$$

In the above chain of inequalities let us take the supremum over all compact $\mathcal{K} \subseteq \mathcal{F}$. Using the tightness of $\mu$ (see (4)), we have

$$\mu(\mathcal{F}) = \sup \{ \mu(\mathcal{K}) \mid \mathcal{K} \subseteq \mathcal{F}, \ \mathcal{K} \text{ compact} \} \leq \sup \big\{ \mu(J^+(\mathcal{K})) \mid \mathcal{K} \subseteq \mathcal{F}, \ \mathcal{K} \text{ compact} \big\} \leq \mu(\mathcal{F}),$$

and so

$$\mu(\mathcal{F}) = \sup \big\{ \mu(J^+(\mathcal{K})) \mid \mathcal{K} \subseteq \mathcal{F}, \ \mathcal{K} \text{ compact} \big\}$$

and similarly for the measure $\nu$. As we can see, in order to obtain 5• from 4• it is enough to take the supremum over all compact $\mathcal{K} \subseteq \mathcal{F}$ on both sides of (21).

(5• ⇒ 3•) Trivial, since open sets are Borel.

(2• ⇒ 6•) In the first step of the proof we will show that 6• holds for all nonnegative $\varphi, \psi \in C_b(\mathcal{M})$ with $\varphi$ compactly supported. Namely, for such functions we will show that the condition

$$\forall\, p, q \in \mathcal{M} \quad p \preceq q \ \Rightarrow \ \varphi(p) \leq \psi(q) \tag{27}$$

implies the inequality of integrals

$$\int_{\mathcal{M}} \varphi \, d\mu \leq \int_{\mathcal{M}} \psi \, d\nu. \tag{28}$$

Then, in the second step, we will demonstrate that the assumptions of nonnegativity of $\varphi, \psi$ and of the compactness of $\operatorname{supp} \varphi$ can in fact be abandoned.

Define a function $\hat{\varphi}: \mathcal{M} \to \mathbb{R}$ via $\hat{\varphi}(p) := \max_{x \preceq p} \varphi(x)$. The function $\hat{\varphi}$ is well defined, because for every $p \in \mathcal{M}$ the function $\varphi$, being continuous, attains its maximum over the compact set $J^-(p) \cap \operatorname{supp} \varphi$. Moreover, $\hat{\varphi}$ satisfies

$$\forall\, p_1, p_2, q \in \mathcal{M} \quad p_1 \preceq p_2 \preceq q \ \Rightarrow \ \varphi(p_1) \leq \hat{\varphi}(p_2) \leq \psi(q). \tag{29}$$

Indeed, the first inequality follows directly from the very definition of $\hat{\varphi}$. In order to obtain the second inequality, notice first that by (27) we have $\varphi(p_2) \leq \psi(q)$. By transitivity of the relation $\preceq$, this inequality holds also if we replace $p_2$ with any $x \preceq p_2$. Hence

$$\hat{\varphi}(p_2) = \max_{x \preceq p_2} \varphi(x) \leq \psi(q)$$

and (29) is proven.

The function $\hat{\varphi}$ is obviously nonnegative, bounded and, by transitivity of $\preceq$, it is causal. We claim that it is also continuous.

Indeed, let us show that for any $\alpha, \beta \in \mathbb{R}$ ($\alpha < \beta$) the preimage $\hat{\varphi}^{-1}((\alpha, \beta))$ is open. Notice first that if $\beta \leq 0$, then by the nonnegativity of $\hat{\varphi}$ the preimage $\hat{\varphi}^{-1}((\alpha, \beta))$ is empty and hence open. Therefore, we can assume from now on that $\beta > 0$.

Observe now that $\hat{\varphi}^{-1}((\alpha, +\infty)) = I^+\big( \varphi^{-1}((\alpha, +\infty)) \big)$.
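This is proven by the following chain of equivalences

$$p \in \hat{\varphi}^{-1}((\alpha, +\infty)) \ \Leftrightarrow \ \hat{\varphi}(p) > \alpha \ \Leftrightarrow \ \max_{x \preceq p} \varphi(x) > \alpha \ \Leftrightarrow \ \exists\, x \preceq p \ \ \varphi(x) > \alpha$$
$$\Leftrightarrow \ \exists\, x \in \varphi^{-1}((\alpha, +\infty)) \ \ x \preceq p \ \Leftrightarrow \ p \in J^+\big( \varphi^{-1}((\alpha, +\infty)) \big)$$

and by the observation that, because $\varphi$ is continuous, $\varphi^{-1}((\alpha, +\infty))$ is open and hence $J^+\big( \varphi^{-1}((\alpha, +\infty)) \big) = I^+\big( \varphi^{-1}((\alpha, +\infty)) \big)$.

Similarly, observe that $\hat{\varphi}^{-1}([\beta, +\infty)) = J^+\big( \varphi^{-1}([\beta, +\infty)) \big)$. This is proven by a chain of equivalences analogous to the one above. Notice that because $\varphi$ is continuous, the preimage $\varphi^{-1}([\beta, +\infty))$ is closed. Moreover, since $\varphi$ is nonnegative and $\beta > 0$, the set $\varphi^{-1}([\beta, +\infty))$ is contained in the support of $\varphi$. But the latter is compact, and so the preimage $\varphi^{-1}([\beta, +\infty))$, being a closed subset of a compact set, is itself compact. By the causal simplicity of $\mathcal{M}$, the set $J^+\big( \varphi^{-1}([\beta, +\infty)) \big)$ is closed.

Finally, notice that

$$\hat{\varphi}^{-1}((\alpha, \beta)) = \hat{\varphi}^{-1}((\alpha, +\infty)) \setminus \hat{\varphi}^{-1}([\beta, +\infty)) = I^+\big( \varphi^{-1}((\alpha, +\infty)) \big) \setminus J^+\big( \varphi^{-1}([\beta, +\infty)) \big),$$

which proves that $\hat{\varphi}^{-1}((\alpha, \beta))$ is an open set.

We have thus shown that $\hat{\varphi} \in C_b(\mathcal{M})$. By 2• we have that

$$\int_{\mathcal{M}} \hat{\varphi} \, d\mu \leq \int_{\mathcal{M}} \hat{\varphi} \, d\nu. \tag{30}$$

But from (30) we readily obtain (28), because

$$\int_{\mathcal{M}} \varphi \, d\mu \leq \int_{\mathcal{M}} \hat{\varphi} \, d\mu \leq \int_{\mathcal{M}} \hat{\varphi} \, d\nu \leq \int_{\mathcal{M}} \psi \, d\nu,$$

where we have used (29). This proves 6• under the assumption that $\varphi$ is compactly supported and both $\varphi$ and $\psi$ are nonnegative.

Footnote (on the compactness of $J^-(p) \cap \operatorname{supp} \varphi$ used to define $\hat{\varphi}$): We are using the fact that in causally simple spacetimes $J^\pm(p)$ are closed sets for all $p \in \mathcal{M}$.

Let us now take any $\varphi, \psi \in C_b(\mathcal{M})$ satisfying (27). Define $m := \min \{ \inf \varphi, \inf \psi \}$ and introduce $\varphi_m, \psi_m \in C_b(\mathcal{M})$ as $\varphi_m := \varphi - m$ and $\psi_m := \psi - m$. Of course $\varphi_m, \psi_m \geq 0$.

Let $(K_n)_{n \in \mathbb{N}}$ be an exhaustion of $\mathcal{M}$ by compact sets. Using Urysohn's lemma for LCH spaces (Theorem 1), we construct a sequence $(\theta_n)_{n \in \mathbb{N}} \subseteq C_c(\mathcal{M})$ of functions such that, for any $n \in \mathbb{N}$, $\theta_n|_{K_n} \equiv 1$ and $0 \leq \theta_n \leq 1$.

For any $n \in \mathbb{N}$ the function $\theta_n \varphi_m$ is compactly supported and, together with $\psi_m$, they are nonnegative and satisfy (27), because for all $p, q \in \mathcal{M}$ such that $p \preceq q$ one has

$$\theta_n(p) \varphi_m(p) \leq \varphi_m(p) = \varphi(p) - m \leq \psi(q) - m = \psi_m(q).$$

On the strength of the previous part of the proof, it is then true that

$$\int_{\mathcal{M}} \theta_n \varphi_m \, d\mu \leq \int_{\mathcal{M}} \psi_m \, d\nu. \tag{31}$$

By the very definition, $\theta_n \leq 1$ for all $n$ and, since $(K_n)_{n \in \mathbb{N}}$ exhausts $\mathcal{M}$, we have that $\theta_n \to 1$ pointwise. By Lebesgue's dominated convergence theorem we can therefore pass with $n \to +\infty$ in (31), obtaining

$$\int_{\mathcal{M}} \varphi_m \, d\mu \leq \int_{\mathcal{M}} \psi_m \, d\nu.$$

This, in turn, yields

$$\int_{\mathcal{M}} ( \varphi(p) - m ) \, d\mu(p) \leq \int_{\mathcal{M}} ( \psi(q) - m ) \, d\nu(q),$$

which, by the fact that $\mu, \nu$ are probability measures, simplifies to

$$\int_{\mathcal{M}} \varphi \, d\mu \leq \int_{\mathcal{M}} \psi \, d\nu$$

and the proof of 6• is complete.

(6• ⇒ 7•) We will use one of the classical results in the optimal transport theory, concerning what is known as the Kantorovich duality. Concretely, we need the following result adapted from [43, Theorem 1.3].

Theorem 9. (Kantorovich duality) Let $(X_1, \mu_1)$ and $(X_2, \mu_2)$ be two Polish probability spaces and let $c: X_1 \times X_2 \to \mathbb{R}_{\geq 0} \cup \{+\infty\}$ be a lower semi-continuous function. Then

$$\min_{\pi \in \Pi(\mu_1, \mu_2)} \int_{X_1 \times X_2} c \, d\pi = \sup_{(\varphi, \psi) \in \Psi(\mu_1, \mu_2)} \Big[ \int_{X_1} \varphi \, d\mu_1 - \int_{X_2} \psi \, d\mu_2 \Big], \tag{32}$$

where

• $\Pi(\mu_1, \mu_2) := \{ \pi \in \mathscr{P}(X_1 \times X_2) \mid (\mathrm{pr}_i)_* \pi = \mu_i, \ i = 1, 2 \}$,

• $\Psi(\mu_1, \mu_2) := \{ (\varphi, \psi) \in C_b(X_1) \times C_b(X_2) \mid \forall\, x \in X_1 \ \forall\, y \in X_2 \ \ \varphi(x) - \psi(y) \leq c(x,y) \}$.

Let us apply the above theorem to the setting in which $(X_1, \mu_1) := (\mathcal{M}, \mu)$, $(X_2, \mu_2) := (\mathcal{M}, \nu)$ and $c: \mathcal{M}^2 \to \mathbb{R}_{\geq 0} \cup \{+\infty\}$ is defined as

$$c(p,q) = \begin{cases} 0 & \text{if } p \preceq q, \\ +\infty & \text{if } p \not\preceq q. \end{cases}$$

The assumptions of Theorem 9 are met: $\mathcal{M}$ is a Polish space (cf. Section 2.1), whereas the function $c$ is lower semi-continuous, because the causal simplicity of $\mathcal{M}$ implies that $J^+$ is a closed subset of $\mathcal{M}^2$.

Notice that in the above setting

$$\Psi(\mu, \nu) = \{ (\varphi, \psi) \in C_b(\mathcal{M}) \times C_b(\mathcal{M}) \mid \forall\, p, q \in \mathcal{M} \ \ p \preceq q \Rightarrow \varphi(p) - \psi(q) \leq 0 \}.$$

In other words, $\Psi(\mu, \nu)$ is the set of exactly those pairs of functions which satisfy the assumptions of condition 6•. Since we assume that 6• holds, we obtain that

$$\forall\, (\varphi, \psi) \in \Psi(\mu, \nu) \quad \int_{\mathcal{M}} \varphi \, d\mu - \int_{\mathcal{M}} \psi \, d\nu \leq 0 \qquad \text{and hence} \qquad \sup_{(\varphi, \psi) \in \Psi(\mu, \nu)} \Big[ \int_{\mathcal{M}} \varphi \, d\mu - \int_{\mathcal{M}} \psi \, d\nu \Big] \leq 0.$$

Using the Kantorovich duality (32), we thus obtain that

$$\min_{\pi \in \Pi(\mu, \nu)} \int_{\mathcal{M}^2} c(p,q) \, d\pi(p,q) \leq 0.$$

In particular, there exists at least one $\omega \in \Pi(\mu, \nu)$ such that the integral above is finite. But, by the very definition of the function $c$, this is possible iff $\omega(\mathcal{M}^2 \setminus J^+) = 0$ or, equivalently, iff $\omega(J^+) = 1$.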
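For finitely supported measures, the cost function $c$ and the two requirements on $\omega$ can be checked directly. The following is a minimal sketch, assuming 1+1 Minkowski spacetime with c = 1 (where $p \preceq q$ reduces to a light-cone inequality) and measures stored as dictionaries; all function names and sample data are illustrative and do not come from the paper.

```python
import math

def precedes(p, q):
    """p precedes q causally in 1+1 Minkowski spacetime; events are (t, x)."""
    (tp, xp), (tq, xq) = p, q
    return tq - tp >= abs(xq - xp)

def cost(p, q):
    """The 0/+infinity cost used in the Kantorovich duality argument."""
    return 0.0 if precedes(p, q) else math.inf

def marginals(omega):
    """Left and right marginals of a coupling {(p, q): weight}."""
    left, right = {}, {}
    for (p, q), w in omega.items():
        left[p] = left.get(p, 0.0) + w
        right[q] = right.get(q, 0.0) + w
    return left, right

def same_measure(a, b, tol=1e-12):
    return all(abs(a.get(e, 0.0) - b.get(e, 0.0)) <= tol
               for e in set(a) | set(b))

def transport_cost(omega):
    # Zero weights are skipped so that 0 * inf never appears.
    return sum(w * cost(p, q) for (p, q), w in omega.items() if w > 0)

def is_causal_coupling(omega, mu, nu):
    """Both marginals match and all mass sits on causally related pairs."""
    left, right = marginals(omega)
    return (same_measure(left, mu) and same_measure(right, nu)
            and transport_cost(omega) == 0.0)

# Two half-masses at rest, three units apart, observed one time unit later.
mu = {(0.0, 0.0): 0.5, (0.0, 3.0): 0.5}
nu = {(1.0, 0.0): 0.5, (1.0, 3.0): 0.5}
straight = {((0.0, 0.0), (1.0, 0.0)): 0.5, ((0.0, 3.0), (1.0, 3.0)): 0.5}
crossed = {((0.0, 0.0), (1.0, 3.0)): 0.5, ((0.0, 3.0), (1.0, 0.0)): 0.5}

print(is_causal_coupling(straight, mu, nu), transport_cost(crossed))
```

Both `straight` and `crossed` have the right marginals, but only the former has finite (indeed zero) cost: the latter would move mass across a spacelike separation.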
Thus, we have proven the existence of a measure $\omega$ with the desired properties.

(7• ⇒ 2•) Let $f \in C_b(\mathcal{M})$ be a causal function. Because the probability measures $\mu$ and $\nu$ are, respectively, left and right marginals of the joint distribution $\omega$, one can write that

$$\int_{\mathcal{M}} f(p) \, d\mu(p) = \int_{\mathcal{M}^2} f(p) \, d\omega(p,q) = \int_{J^+} f(p) \, d\omega(p,q) \leq \int_{J^+} f(q) \, d\omega(p,q) = \int_{\mathcal{M}^2} f(q) \, d\omega(p,q) = \int_{\mathcal{M}} f(q) \, d\nu(q),$$

where the inequality follows from the causality of $f$. In the integrals with respect to $\omega$ we can always switch between $\mathcal{M}^2$ and $J^+$ because $\omega(\mathcal{M}^2 \setminus J^+) = 1 - \omega(J^+) = 0$.

The fourth result concerns globally hyperbolic spacetimes. It provides an additional characterisation of causality in terms of Cauchy hypersurfaces.

Theorem 10. Let $\mathcal{M}$ be a globally hyperbolic spacetime. Conditions 1•–7• are equivalent to the following condition:

8• For every Cauchy hypersurface $\mathcal{S} \subseteq \mathcal{M}$
$$\mu(J^+(\mathcal{S})) \leq \nu(J^+(\mathcal{S})). \tag{33}$$

Footnote (on the Cauchy hypersurfaces in 8•): This includes nonsmooth and nonspacelike ones (the Cauchy surfaces considered must be achronal, but need not be acausal).

Proof: (5• ⇒ 8•) Trivial.

(8• ⇒ 4•) Let $\mathcal{T}: \mathcal{M} \to \mathbb{R}$ be a smooth temporal function whose every level set is a Cauchy hypersurface.

Take any compact subset $\mathcal{K} \subseteq \mathcal{M}$. Let $T_0$ denote the minimal value attained on $\mathcal{K}$ by the function $\mathcal{T}$. For any $n \in \mathbb{N}$ define the level set $\mathcal{S}_n := \mathcal{T}^{-1}(T_0 + n)$. Every $\mathcal{S}_n$ is a smooth spacelike Cauchy hypersurface. Now, for any $n \in \mathbb{N}$ consider the set

$$\Sigma_n := \partial J^+(\mathcal{S}_n \cup \mathcal{K}).$$

[Figure 3: The construction of the $\Sigma_n$'s.]

We claim that for every $n \in \mathbb{N}$, $\Sigma_n$ is a Cauchy hypersurface and that

$$J^+(\Sigma_n) = J^+(\mathcal{S}_n \cup \mathcal{K}). \tag{34}$$

Indeed, observe first that $J^+(\mathcal{S}_n \cup \mathcal{K})$ is a future set. By [33, Chapter 14, Corollary 27] $\Sigma_n$ is therefore a closed achronal topological hypersurface. Let $\gamma$ be any inextendible timelike curve. It crosses the Cauchy hypersurfaces $\mathcal{S}_n$ (which is contained in $J^+(\mathcal{S}_n \cup \mathcal{K})$) and $\mathcal{S}_0$ (the past of which, $I^-(\mathcal{S}_0)$, is disjoint from $J^+(\mathcal{S}_n \cup \mathcal{K})$), therefore it must cross the boundary $\partial J^+(\mathcal{S}_n \cup \mathcal{K}) = \Sigma_n$. Since the latter is achronal, it is met by $\gamma$ exactly once and therefore $\Sigma_n$ is a Cauchy hypersurface.

In order to obtain (34), we prove the following lemma.

Lemma 1. Let $\mathcal{M}$ be a spacetime and let $\mathcal{F} \subseteq \mathcal{M}$ be a closed future set such that $\mathcal{F} \subseteq J^+(X)$ for some achronal set $X$. Then $J^+(\partial \mathcal{F}) = \mathcal{F}$.

Proof: '$\subseteq$' Because $\mathcal{F}$ is closed, it contains its boundary: $\partial \mathcal{F} \subseteq \mathcal{F}$. Hence $J^+(\partial \mathcal{F}) \subseteq J^+(\mathcal{F}) = \mathcal{F}$, because $\mathcal{F}$ is a future set.

'$\supseteq$' Take $q \in \mathcal{F}$. By assumption, there exists $x \in X$ and a future-directed causal curve $\gamma$ connecting $x$ with $q$.

Notice, first, that $x \notin \mathcal{F} \setminus \partial \mathcal{F} = \operatorname{int} \mathcal{F}$. Indeed, if $x$ belonged to $\operatorname{int} \mathcal{F}$, which is an open subset of $\mathcal{F}$, there would exist $x' \in \mathcal{F}$ such that $x' \ll x$. But since $\mathcal{F} \subseteq J^+(X)$, there would exist $x'' \in X$ such that $x'' \preceq x'$. Altogether, by (5) we would obtain that $x'' \ll x$, in contradiction with the achronality of $X$. Therefore, either $x \in \partial \mathcal{F}$ or $x \in \mathcal{M} \setminus \mathcal{F}$.

If $x \in \partial \mathcal{F}$, then $q \in J^+(\partial \mathcal{F})$ and the proof is complete.

On the other hand, if $x \in \mathcal{M} \setminus \mathcal{F}$, then the curve $\gamma$ must cross $\partial \mathcal{F}$ at some point $p$. Of course, $p \preceq q$ and hence also in this case $q \in J^+(\partial \mathcal{F})$.

Notice now that $J^+(\mathcal{S}_n \cup \mathcal{K}) = J^+(\mathcal{S}_n) \cup J^+(\mathcal{K})$ is in fact a closed future set such that $J^+(\mathcal{S}_n \cup \mathcal{K}) \subseteq J^+(\mathcal{S}_0)$. On the strength of Lemma 1, we obtain (34).

By 8•, because $\Sigma_n$ is a Cauchy hypersurface for any $n \in \mathbb{N}$, we can write that

$$\mu\big( J^+(\Sigma_n) \big) \leq \nu\big( J^+(\Sigma_n) \big). \tag{35}$$

Observe that the sequence $(J^+(\Sigma_n))_{n \in \mathbb{N}}$ is decreasing, because for all $n \in \mathbb{N}$

$$J^+(\Sigma_{n+1}) = J^+(\mathcal{S}_{n+1} \cup \mathcal{K}) = J^+(\mathcal{S}_{n+1}) \cup J^+(\mathcal{K}) = \mathcal{T}^{-1}([T_0 + n + 1, +\infty)) \cup J^+(\mathcal{K})$$
$$\subseteq \mathcal{T}^{-1}([T_0 + n, +\infty)) \cup J^+(\mathcal{K}) = J^+(\Sigma_n),$$

where we have used (34) and the very definition of the $\mathcal{S}_n$'s. Property (2) allows us to pass with $n \to +\infty$ in (35) and write that

$$\mu\Big( \bigcap_{n=0}^{\infty} J^+(\Sigma_n) \Big) \leq \nu\Big( \bigcap_{n=0}^{\infty} J^+(\Sigma_n) \Big). \tag{36}$$

The countable intersection appearing above can be easily shown to be equal to $J^+(\mathcal{K})$. Indeed, one has

$$\bigcap_{n=0}^{\infty} J^+(\Sigma_n) = J^+(\mathcal{K}) \cup \bigcap_{n=0}^{\infty} J^+(\mathcal{S}_n) = J^+(\mathcal{K}) \cup \bigcap_{n=0}^{\infty} \mathcal{T}^{-1}([T_0 + n, +\infty)) = J^+(\mathcal{K}) \cup \mathcal{T}^{-1}\Big( \underbrace{\bigcap_{n=0}^{\infty} [T_0 + n, +\infty)}_{=\, \emptyset} \Big) = J^+(\mathcal{K}).$$

Therefore, (36) yields (21) and the proof of 4• is complete.

We have thus provided eight different characterisations of a causal relation between probability measures, which are equivalent if the underlying spacetime is globally hyperbolic. Some of the implications hold under lower causality conditions, as demonstrated in Theorems 6–8. Let us now discuss other implications not covered in the proofs.
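As a finite illustration of the future-set characterisations, the check below is a minimal sketch, assuming finitely supported measures on 1+1 Minkowski spacetime (c = 1). It tests the analogue of condition 4• in which the compact set $\mathcal{K}$ ranges only over subsets of the finitely many events carrying mass, and $J^+(\mathcal{K})$ is intersected with that same finite set. This restriction, the function names and the sample measures are assumptions made for illustration; the sketch is not a substitute for the proofs above.

```python
from itertools import combinations

def precedes(p, q):
    """p precedes q causally in 1+1 Minkowski spacetime; events are (t, x)."""
    (tp, xp), (tq, xq) = p, q
    return tq - tp >= abs(xq - xp)

def future_of(subset, events):
    """Events lying in the causal future of at least one event of `subset`."""
    return {q for q in events if any(precedes(p, q) for p in subset)}

def satisfies_future_condition(mu, nu, tol=1e-12):
    """Check mu(J+(K)) <= nu(J+(K)) for every nonempty K within the support."""
    events = sorted(set(mu) | set(nu))
    for size in range(1, len(events) + 1):
        for subset in combinations(events, size):
            future = future_of(subset, events)
            mass_mu = sum(mu.get(e, 0.0) for e in future)
            mass_nu = sum(nu.get(e, 0.0) for e in future)
            if mass_mu > mass_nu + tol:
                return False
    return True

mu = {(0.0, 0.0): 1.0}
nu_spread = {(1.0, -0.5): 0.5, (1.0, 0.5): 0.5}   # inside the future cone
nu_spacelike = {(0.5, 2.0): 1.0}                  # spacelike separated

print(satisfies_future_condition(mu, nu_spread))
print(satisfies_future_condition(nu_spread, mu))
print(satisfies_future_condition(mu, nu_spacelike))
```

The enumeration is exponential in the number of events, so it is meant only for very small supports.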
Footnote (on the closedness of $J^+(\mathcal{S}_n \cup \mathcal{K})$ used in the proof of Theorem 10): For the closedness of $J^+(\mathcal{S}_n)$, we refer e.g. to [37, Section 10.2.7]. The closedness of $J^+(\mathcal{K})$, on the other hand, follows from the causal simplicity of $\mathcal{M}$.

Remark 1. Let us first stress that the formulation of conditions 3•–5• using the future of a set is just a matter of convention and one could equally well employ the pasts. Concretely, a straightforward application of the time inversion (note that such an operation changes the relation $\preceq$ into the opposite one) shows that conditions 3•, 4•, 5• are (in any spacetime $\mathcal{M}$) equivalent to the following conditions, respectively:

3′• For every open past set $\mathcal{P} \subseteq \mathcal{M}$
$$\mu(\mathcal{P}) \geq \nu(\mathcal{P}). \tag{37}$$

4′• For every compact $\mathcal{K} \subseteq \mathcal{M}$
$$\mu(J^-(\mathcal{K})) \geq \nu(J^-(\mathcal{K})). \tag{38}$$

5′• For every Borel past set $\mathcal{P} \subseteq \mathcal{M}$
$$\mu(\mathcal{P}) \geq \nu(\mathcal{P}). \tag{39}$$

Remark 2. Clearly, the proof of the implication 7• ⇒ 2• uses neither the causal simplicity of $\mathcal{M}$ nor the boundedness of the function $f$. In fact, it works for any spacetime and for any $\mu$- and $\nu$-integrable causal function. We can, therefore, write down the following condition:

2′• For every causal $f \in L^1(\mathcal{M}, \mu) \cap L^1(\mathcal{M}, \nu)$,
$$\int_{\mathcal{M}} f \, d\mu \leq \int_{\mathcal{M}} f \, d\nu. \tag{40}$$

For any spacetime $\mathcal{M}$ it is then true that 7• ⇒ 2′• as well as, trivially, 2′• ⇒ 2• ⇒ 1•.

Remark 3. Condition 2′• implies 5• in any spacetime $\mathcal{M}$.

Proof: Let $\mathcal{F}$ be a Borel future subset of $\mathcal{M}$. Clearly, $\mathbb{1}_{\mathcal{F}} \in L^1(\mathcal{M}, \mu) \cap L^1(\mathcal{M}, \nu)$ and, by Corollary 1, it is a causal function. By condition 2′• we can write

$$\mu(\mathcal{F}) = \int_{\mathcal{M}} \mathbb{1}_{\mathcal{F}} \, d\mu \leq \int_{\mathcal{M}} \mathbb{1}_{\mathcal{F}} \, d\nu = \nu(\mathcal{F}),$$

which proves 5•.

Remark 4. Also the implication 7• ⇒ 6• holds in all spacetimes. We can show even slightly more, namely, that condition 7• implies

6′• For all $\varphi, \psi: \mathcal{M} \to \mathbb{R}$ such that $\varphi$ is $\mu$-integrable and $\psi$ is $\nu$-integrable
$$\big[ \forall\, p, q \in \mathcal{M} \ \ p \preceq q \Rightarrow \varphi(p) \leq \psi(q) \big] \ \Rightarrow \ \int_{\mathcal{M}} \varphi \, d\mu \leq \int_{\mathcal{M}} \psi \, d\nu. \tag{41}$$

Proof: Similarly as in the proof of 7• ⇒ 2•, one can write that

$$\int_{\mathcal{M}} \varphi(p) \, d\mu(p) = \int_{\mathcal{M}^2} \varphi(p) \, d\omega(p,q) = \int_{J^+} \varphi(p) \, d\omega(p,q) \leq \int_{J^+} \psi(q) \, d\omega(p,q) = \int_{\mathcal{M}^2} \psi(q) \, d\omega(p,q) = \int_{\mathcal{M}} \psi(q) \, d\nu(q),$$

where the inequality follows from the assumptions on $\varphi$ and $\psi$.

4.2 Basic properties of the causal relation between measures

In the previous subsection we have shown that for any spacetime $\mathcal{M}$ the condition 7• not only implies all of the others listed in Theorems 6, 7, 8 and 10, but also the more general ones 2′• and 6′•. It encourages us to promote the condition 7• to a definition of the causal precedence relation on $\mathscr{P}(\mathcal{M})$ for any spacetime $\mathcal{M}$.

Definition 2.

Let $\mathcal{M}$ be a spacetime. For any $\mu, \nu \in \mathscr{P}(\mathcal{M})$ we say that $\mu$ causally precedes $\nu$ (symbolically $\mu \preceq \nu$) iff there exists $\omega \in \mathscr{P}(\mathcal{M}^2)$ such that

i) $(\mathrm{pr}_1)_* \omega = \mu$ and $(\mathrm{pr}_2)_* \omega = \nu$,

ii) $\omega(J^+) = 1$.

Such an $\omega$ will be called a causal coupling of $\mu$ and $\nu$.

Observe that $\omega(J^+)$ is well-defined because, by Theorem 4, $J^+$ is $\sigma$-compact, and hence Borel, for any spacetime $\mathcal{M}$.

Remark 5.

In the case of causally simple spacetimes $J^+ \subseteq \mathcal{M}^2$ is closed and therefore, by the very definition of the support of a measure (see the last paragraph of Section 2.2), condition ii) in Definition 2 is equivalent to the inclusion $\operatorname{supp} \omega \subseteq J^+$. However, without the assumption of causal simplicity this is no longer true.

The term 'coupling (of measures $\mu$ and $\nu$)' comes from the optimal transport theory [43], where it describes any $\omega \in \mathscr{P}(\mathcal{M}^2)$ with property i) of the above definition. The set of such couplings, denoted $\Pi(\mu, \nu)$, has already appeared above in the context of the Kantorovich duality (Theorem 9).

Such a coupling, or a transference plan, as it is also called, can be regarded as an instruction how to 'reconfigure' a fixed amount of 'mass' distributed over $\mathcal{M}$ according to the measure $\mu$ so that it becomes distributed according to the measure $\nu$. This 'reconfiguration' involves transporting the (possibly infinitesimal) portions of 'mass' between points of $\mathcal{M}$, and a coupling $\omega \in \Pi(\mu, \nu) \subseteq \mathscr{P}(\mathcal{M}^2)$ precisely describes what amount of 'mass' is transported between any given pair of points.

It is, however, property ii) which ties the above definition with the causality theory. It can be summarised as a requirement that the transport of 'mass' be conducted along future-directed causal curves only, which is why such couplings deserve to be called causal. The set of all causal couplings of measures $\mu$ and $\nu$ will be denoted by $\Pi_c(\mu, \nu)$.

Notice that a (causal) coupling does not specify along which (causal) curves the portions of 'mass' are transported. In fact, various families of (causal) curves can lead to the same (causal) coupling. Notice also that the 'mass' concentrated initially at some point $p \in \mathcal{M}$ can dilute to many different points.

Observe that for Dirac measures $\mu = \delta_p$, $\nu = \delta_q$ Definition 2 reduces to the standard definition of the causal relation between events $p$ and $q$. This can be seen as a corollary of the following proposition.

Proposition 4.

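Let $\mathcal{M}$ be a topological space and let $\mu, \nu \in \mathscr{P}(\mathcal{M})$ and $\omega \in \Pi(\mu, \nu)$. Then, for any Borel sets $A, B \subseteq \mathcal{M}$

i) $\mu(A) = \nu(B) = 1 \ \Leftrightarrow\ \omega(A \times B) = 1$,

ii) $\mu(A) = 0 \,\vee\, \nu(B) = 0 \ \Rightarrow\ \omega(A \times B) = 0$.

Proof: i) To prove '$\Rightarrow$' we use the inclusion–exclusion principle to write
\[ 1 \geq \omega(A \times B) = \omega\big((A \times \mathcal{M}) \cap (\mathcal{M} \times B)\big) = \underbrace{\omega(A \times \mathcal{M})}_{=\,\mu(A)\,=\,1} + \underbrace{\omega(\mathcal{M} \times B)}_{=\,\nu(B)\,=\,1} - \underbrace{\omega\big((A \times \mathcal{M}) \cup (\mathcal{M} \times B)\big)}_{\leq\, 1} \geq 1 + 1 - 1 = 1. \]
Conversely, to prove '$\Leftarrow$', notice that
\[ 1 \geq \mu(A) = \omega(A \times \mathcal{M}) \geq \omega(A \times B) = 1 \quad \text{and} \quad 1 \geq \nu(B) = \omega(\mathcal{M} \times B) \geq \omega(A \times B) = 1. \]

ii) One has
\[ 0 \leq \omega(A \times B) \leq \min\{\omega(A \times \mathcal{M}), \omega(\mathcal{M} \times B)\} = \min\{\mu(A), \nu(B)\} = 0. \]

Corollary 5.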

Let $\mathcal{M}$ be a spacetime. Then for any $p, q \in \mathcal{M}$
\[ p \preceq q \quad \text{iff} \quad \delta_p \preceq \delta_q. \]

Proof: By Proposition 4, the only coupling between two Dirac measures $\delta_p$, $\delta_q$ is their product measure $\omega := \delta_p \times \delta_q = \delta_{(p,q)}$. Hence, the fact that $p \preceq q$ is equivalent in this case to the requirement that $\omega(J^+) = 1$.

Corollary 6. Let $\mathcal{M}$ be a causally simple spacetime. For any $p, q \in \mathcal{M}$ the following conditions are equivalent:

$1^\diamond$ $\forall\, f \in C(\mathcal{M}) \quad f(p) \leq f(q)$,

$2^\diamond$ $p \preceq q$.

Proof: It is a direct consequence of the equivalence ($1^\bullet \Rightarrow \bullet$) in Theorem 8 and Corollary 5.

If the measure $\mu$ is compactly supported, then in the light of the above discussion it is natural to expect that the support of any $\nu$ with $\mu \preceq \nu$ should be within the future of $\operatorname{supp} \mu$ [45]. This intuitive condition is in fact true in causally simple spacetimes.

Proposition 5.

Let $\mathcal{M}$ be a spacetime and let $\mu, \nu \in \mathscr{P}(\mathcal{M})$, with $\mu$ compactly supported and $\mu \preceq \nu$. Then $\nu(J^+(\operatorname{supp} \mu)) = 1$. Moreover, if $\mathcal{M}$ is causally simple then $\operatorname{supp} \nu \subseteq J^+(\operatorname{supp} \mu)$.

Proof: By condition $4^\bullet$ (which is implied by Definition 2) it is true that
\[ 1 = \mu(\operatorname{supp} \mu) \leq \mu(J^+(\operatorname{supp} \mu)) \leq \nu(J^+(\operatorname{supp} \mu)) \leq 1, \]
and therefore $\nu(J^+(\operatorname{supp} \mu)) = 1$.

We now claim that if $\mathcal{M}$ is causally simple, then this implies that $\operatorname{supp} \nu \subseteq J^+(\operatorname{supp} \mu)$. Indeed, recall that in a causally simple spacetime the causal futures of compact sets are closed. Therefore, if there existed $q \in \operatorname{supp} \nu$ but $q \notin J^+(\operatorname{supp} \mu)$, then we could take an open neighborhood $U \ni q$ such that $\nu(U) > 0$ and $U \cap J^+(\operatorname{supp} \mu) = \emptyset$. But this would imply that
\[ \nu(J^+(\operatorname{supp} \mu)) \leq 1 - \nu(U) < 1, \]
in contradiction with the first part of the proof.

Recall that the causal precedence relation between events is reflexive, transitive and, iff $\mathcal{M}$ is causal, antisymmetric. We now prove analogous results for the space of Borel probability measures on $\mathcal{M}$ equipped with the relation $\preceq$. To this end, it will be convenient to use the diagonal function $\Delta : \mathcal{M} \to \mathcal{M}^2$, defined as $\Delta(p) := (p, p)$ for any $p \in \mathcal{M}$.

Theorem 11.

Let $\mathcal{M}$ be a spacetime. The relation $\preceq$ on $\mathscr{P}(\mathcal{M})$ is reflexive and transitive.

Proof: To prove reflexivity of $\preceq$, it suffices to notice that for any $\mu \in \mathscr{P}(\mathcal{M})$ the push-forward measure $\Delta_* \mu$ is a causal coupling of $\mu$ with itself. Indeed, $(\mathrm{pr}_i)_* \Delta_* \mu = (\mathrm{pr}_i \circ \Delta)_* \mu = \mathrm{id}_* \mu = \mu$ for $i = 1, 2$ and $\Delta_* \mu(J^+) = \mu(\Delta^{-1}(J^+)) = \mu(\mathcal{M}) = 1$, where we have used the equality $\Delta^{-1}(J^+) = \mathcal{M}$, which expresses nothing but the reflexivity of the causal precedence relation between events.

We now move to proving the transitivity of $\preceq$. Let us invoke the following standard result [43, Lemma 7.6] from the optimal transport theory.

Lemma 2. (Gluing Lemma)

Let $(\mathcal{X}_i, \mu_i)$, $i = 1, 2, 3$, be Polish probability spaces and assume there exist couplings $\omega_{12} \in \Pi(\mu_1, \mu_2)$ and $\omega_{23} \in \Pi(\mu_2, \mu_3)$. Then, there exists $\omega_{123} \in \mathscr{P}(\mathcal{X}_1 \times \mathcal{X}_2 \times \mathcal{X}_3)$ such that $(\mathrm{pr}_{12})_* \omega_{123} = \omega_{12}$ and $(\mathrm{pr}_{23})_* \omega_{123} = \omega_{23}$, where $\mathrm{pr}_{ij} : \mathcal{X}_1 \times \mathcal{X}_2 \times \mathcal{X}_3 \to \mathcal{X}_i \times \mathcal{X}_j$ denotes the canonical projection map. Moreover, $\omega_{13} := (\mathrm{pr}_{13})_* \omega_{123}$ belongs to $\Pi(\mu_1, \mu_3)$.

The Gluing Lemma works well with the causal precedence relation. Concretely, let us take $\mu_1, \mu_2, \mu_3 \in \mathscr{P}(\mathcal{M})$ such that $\mu_1 \preceq \mu_2 \preceq \mu_3$, where $\omega_{12} \in \Pi_c(\mu_1, \mu_2)$ and $\omega_{23} \in \Pi_c(\mu_2, \mu_3)$. Then the coupling $\omega_{13}$ of $\mu_1$ and $\mu_3$ is causal, too.

Indeed, notice first that
\[ \omega_{123}\big(\{(p,q,r) \in \mathcal{M}^3 \mid p \preceq q \npreceq r\}\big) \leq \omega_{123}\big(\{(p,q,r) \in \mathcal{M}^3 \mid q \npreceq r\}\big) = \omega_{123}\big(\mathcal{M} \times (\mathcal{M}^2 \setminus J^+)\big) = \omega_{23}\big(\mathcal{M}^2 \setminus J^+\big) = 1 - \omega_{23}(J^+) = 0, \]
and thus $\omega_{123}(\{(p,q,r) \in \mathcal{M}^3 \mid p \preceq q \npreceq r\}) = 0$.

On the other hand,
\[ \omega_{123}\big(\{(p,q,r) \in \mathcal{M}^3 \mid p \npreceq q\}\big) = \omega_{123}\big((\mathcal{M}^2 \setminus J^+) \times \mathcal{M}\big) = \omega_{12}\big(\mathcal{M}^2 \setminus J^+\big) = 1 - \omega_{12}(J^+) = 0. \]
Since $\mathcal{M}^3$ can be decomposed into the following union of (pairwise disjoint) sets
\[ \mathcal{M}^3 = \{(p,q,r) \in \mathcal{M}^3 \mid p \preceq q \preceq r\} \cup \{(p,q,r) \in \mathcal{M}^3 \mid p \preceq q \npreceq r\} \cup \{(p,q,r) \in \mathcal{M}^3 \mid p \npreceq q\}, \]
we obtain
\[ 1 = \omega_{123}(\mathcal{M}^3) = \omega_{123}\big(\{p \preceq q \preceq r\}\big) + \underbrace{\omega_{123}\big(\{p \preceq q \npreceq r\}\big)}_{=\,0} + \underbrace{\omega_{123}\big(\{p \npreceq q\}\big)}_{=\,0} \]
and hence
\[ \omega_{123}\big(\{(p,q,r) \in \mathcal{M}^3 \mid p \preceq q \preceq r\}\big) = 1. \tag{42} \]
But this, in turn, means that
\[ 1 \geq \omega_{13}(J^+) = \omega_{123}\big(\{(p,q,r) \in \mathcal{M}^3 \mid p \preceq r\}\big) \geq \omega_{123}\big(\{(p,q,r) \in \mathcal{M}^3 \mid p \preceq q \preceq r\}\big) = 1, \]
where the middle inequality is a direct consequence of the transitivity of the causal precedence relation between events.
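For finitely supported measures the gluing used in this argument can be written down explicitly. The sketch below is an illustration under stated assumptions rather than the construction of [43]: it uses one particular gluing, $\omega_{123}(p,q,r) = \omega_{12}(p,q)\,\omega_{23}(q,r)/\mu_2(q)$, and returns its $(1,3)$-marginal. Couplings are dictionaries mapping pairs of events to positive weights, events are pairs `(t, x)` in (1+1)-dimensional Minkowski spacetime with $c = 1$, and the helper names `precedes`, `marginal`, `glue` and `is_causal` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def precedes(p, q):
    """Causal precedence p <= q for events (t, x) in (1+1)-dim Minkowski, c = 1."""
    dt, dx = q[0] - p[0], q[1] - p[1]
    return dt >= abs(dx)


def marginal(omega, axis):
    """Marginal of a finitely supported coupling on its first (0) or second (1) factor."""
    result = defaultdict(Fraction)
    for pair, weight in omega.items():
        result[pair[axis]] += weight
    return dict(result)


def glue(omega12, omega23):
    """Return omega13, the (1,3)-marginal of the gluing
    omega123(p, q, r) = omega12(p, q) * omega23(q, r) / mu2(q).

    Assumes all stored weights are positive and that both couplings
    share the same middle marginal mu2.
    """
    mu2 = marginal(omega12, 1)
    if mu2 != marginal(omega23, 0):
        raise ValueError("couplings do not share the middle marginal")
    omega13 = defaultdict(Fraction)
    for (p, q), w12 in omega12.items():
        for (q2, r), w23 in omega23.items():
            if q2 == q:
                omega13[(p, r)] += w12 * w23 / mu2[q]
    return dict(omega13)


def is_causal(omega):
    """True if every pair carrying positive weight lies in J^+."""
    return all(precedes(p, q) for (p, q), w in omega.items() if w > 0)
```

If `omega12` and `omega23` are both causal, every pair `(p, r)` in the result arises from some chain $p \preceq q \preceq r$, so `is_causal(glue(omega12, omega23))` holds by the transitivity of $\preceq$ between events, mirroring the chain of inequalities above.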
We have thus proven that $\omega_{13}(J^+) = 1$, and so $\omega_{13} \in \Pi_c(\mu_1, \mu_3)$ and therefore $\mu_1 \preceq \mu_3$.

A natural question arises: how robust must the causal structure of a spacetime $\mathcal{M}$ be to render the relation $\preceq$ antisymmetric and hence a partial order? Obviously, $\mathcal{M}$ must be at least causal (otherwise even the causal precedence relation between events fails to be antisymmetric).

We have the following partial result.

Theorem 12.

Let $\mathcal{M}$ be a spacetime with the following property: for any compact $K \subseteq \mathcal{M}$ there exists a Borel function $\tau_K : K \to \mathbb{R}$ such that
\[ \forall\, p, q \in K,\ p \neq q: \quad p \preceq q \ \Rightarrow\ \tau_K(p) < \tau_K(q). \tag{43} \]
Then, for any $\mu \in \mathscr{P}(\mathcal{M})$
\[ \Pi_c(\mu, \mu) = \{\Delta_* \mu\}. \]
Moreover, the relation $\preceq$ is antisymmetric.

Remark 6.

Property (43) implies that $\mathcal{M}$ is causal. Indeed, suppose that there exist two distinct events $p, q \in \mathcal{M}$ such that $p \preceq q \preceq p$. Taking now $K = \{p, q\}$, on the strength of (43) we would obtain that $\tau_K(p) < \tau_K(q) < \tau_K(p)$, a contradiction.

On the other hand, if $\mathcal{M}$ is past (future) distinguishing, then any past (resp. future) volume function is a semi-continuous, and hence Borel, generalised time function (cf. Section 2.3). This obviously implies (43): for any compact $K \subseteq \mathcal{M}$ simply define $\tau_K := \tau|_K$.

However, being past or future distinguishing is not necessary for (43) to hold. Indeed, the rightmost diagram in [31, Figure 6] presents a causal, but neither future nor past distinguishing spacetime $\mathcal{M} := \mathbb{R} \times S^1 \setminus \{(0, 0)\}$, which admits a Borel generalised time function, for instance
\[ \tau(x, \theta) := \begin{cases} \arctan x & \text{for } x < 0, \\ \theta & \text{for } x = 0, \\ 2\pi + \arctan x & \text{for } x > 0, \end{cases} \]
for any $x \in \mathbb{R}$ and $\theta \in S^1$, where the latter is the angular coordinate whose range is $[0, 2\pi)$, except for $x = 0$, when its range is $(0, 2\pi)$.

Before we move to the proof of Theorem 12, let us present the following lemma.

Lemma 3.

Let $\mathcal{M}$ be a topological space and let $\mu, \nu$ be two Borel probability measures on $\mathcal{M}$. Finally, let $\omega \in \Pi(\mu, \nu)$ be such that $\omega(\Delta(\mathcal{M})) = 1$. Then $\mu = \nu$ and $\omega = \Delta_* \mu = \Delta_* \nu$.

Proof: Let $U$ be any Borel subset of $\mathcal{M}^2$. Then, $\omega(U \setminus \Delta(\mathcal{M})) \leq \omega(\mathcal{M}^2 \setminus \Delta(\mathcal{M})) = 1 - \omega(\Delta(\mathcal{M})) = 0$ and therefore $\omega(U \setminus \Delta(\mathcal{M})) = 0$. But this allows us to write
\[ \omega(U) = \omega(U \cap \Delta(\mathcal{M})) + \underbrace{\omega(U \setminus \Delta(\mathcal{M}))}_{=\,0} = \omega\big(\Delta(\Delta^{-1}(U))\big). \]
The rightmost expression, in turn, can be further transformed either into
\[ \omega\big(\Delta(\Delta^{-1}(U))\big) = \omega\big((\Delta^{-1}(U) \times \mathcal{M}) \cap \Delta(\mathcal{M})\big) = \omega\big(\Delta^{-1}(U) \times \mathcal{M}\big) - \underbrace{\omega\big((\Delta^{-1}(U) \times \mathcal{M}) \setminus \Delta(\mathcal{M})\big)}_{=\,0} = \mu\big(\Delta^{-1}(U)\big) = \Delta_* \mu(U) \]
or into
\[ \omega\big(\Delta(\Delta^{-1}(U))\big) = \omega\big((\mathcal{M} \times \Delta^{-1}(U)) \cap \Delta(\mathcal{M})\big) = \omega\big(\mathcal{M} \times \Delta^{-1}(U)\big) - \underbrace{\omega\big((\mathcal{M} \times \Delta^{-1}(U)) \setminus \Delta(\mathcal{M})\big)}_{=\,0} = \nu\big(\Delta^{-1}(U)\big) = \Delta_* \nu(U), \]
which proves the second part of the lemma. To obtain the equality $\mu = \nu$, take any Borel $V \subseteq \mathcal{M}$ and notice, for instance, that
\[ \nu(V) = \omega(\mathcal{M} \times V) = \Delta_* \mu(\mathcal{M} \times V) = \mu\big(\Delta^{-1}(\mathcal{M} \times V)\big) = \mu(V), \]
which concludes the entire proof.

Proof of Theorem 12:

Take any $\mu \in \mathscr{P}(\mathcal{M})$ and let $\pi \in \Pi_c(\mu, \mu)$. By Definition 2, we have that
\[ \forall\, f \in L^1(\mathcal{M}, \mu) \quad \int_{J^+} f(p) \, d\pi(p,q) = \int_{\mathcal{M}} f \, d\mu = \int_{J^+} f(q) \, d\pi(p,q) \]
and hence
\[ \forall\, f \in L^1(\mathcal{M}, \mu) \quad \int_{J^+} \big(f(q) - f(p)\big) \, d\pi(p,q) = 0 \]
or, by noticing that the integrand vanishes on $\Delta(\mathcal{M})$,
\[ \forall\, f \in L^1(\mathcal{M}, \mu) \quad \int_{J^+ \setminus \Delta(\mathcal{M})} \big(f(q) - f(p)\big) \, d\pi(p,q) = 0. \tag{44} \]

Suppose now that $\pi(J^+ \setminus \Delta(\mathcal{M})) > 0$. Because $\pi$ is tight, there exists a compact set $\mathcal{K} \subseteq J^+ \setminus \Delta(\mathcal{M})$ with $\pi(\mathcal{K}) > 0$. Notice that $\mathcal{K} \subseteq K^2$, where $K := \mathrm{pr}_1 \mathcal{K} \cup \mathrm{pr}_2 \mathcal{K}$ is a compact subset of $\mathcal{M}$, and so $\pi(K^2 \cap J^+ \setminus \Delta(\mathcal{M})) > 0$. Define $f_K : \mathcal{M} \to \mathbb{R}$ via
\[ f_K(p) := \begin{cases} \arctan \tau_K(p) & \text{for } p \in K, \\ 0 & \text{for } p \notin K, \end{cases} \]
where $\tau_K$ is a function whose existence is guaranteed by property (43). The function $f_K$ is Borel and bounded, and hence $\mu$-integrable. Plugging it into (44) yields
\[ \int_{K^2 \cap J^+ \setminus \Delta(\mathcal{M})} \big(\arctan \tau_K(q) - \arctan \tau_K(p)\big) \, d\pi(p,q) = 0. \]
But the integrand of the above integral is positive on $K^2 \cap J^+ \setminus \Delta(\mathcal{M})$ by the very definition of $\tau_K$, therefore the fact that the integral is zero implies that $\pi(K^2 \cap J^+ \setminus \Delta(\mathcal{M})) = 0$, which contradicts the earlier result. This proves that $\pi(J^+ \setminus \Delta(\mathcal{M})) = 0$.

By property ii) from Definition 2, this in turn means that
\[ \pi(\Delta(\mathcal{M})) = \pi(J^+) - \pi(J^+ \setminus \Delta(\mathcal{M})) = 1. \]
On the strength of Lemma 3, we get that $\pi = \Delta_* \mu$.

We now move to proving the antisymmetry of the relation $\preceq$. Let $\mu, \nu \in \mathscr{P}(\mathcal{M})$ be such that $\mu \preceq \nu \preceq \mu$. Let $\omega \in \Pi_c(\mu, \nu)$ and $\varpi \in \Pi_c(\nu, \mu)$. By the Gluing Lemma, there exists $\Omega \in \mathscr{P}(\mathcal{M}^3)$ such that $(\mathrm{pr}_{12})_* \Omega = \omega$, $(\mathrm{pr}_{23})_* \Omega = \varpi$ and $(\mathrm{pr}_{13})_* \Omega \in \Pi_c(\mu, \mu)$, which, by the previous part of the proof, means that $(\mathrm{pr}_{13})_* \Omega = \Delta_* \mu$.

Formula (42) takes here the following form
\[ \Omega\big(\{(p,q,r) \in \mathcal{M}^3 \mid p \preceq q \preceq r\}\big) = 1. \]
Notice, however, that the set $\{(p,q,r) \in \mathcal{M}^3 \mid p \preceq q \preceq r \neq p\}$ is $\Omega$-null, because
\[ \Omega\big(\{(p,q,r) \in \mathcal{M}^3 \mid p \preceq q \preceq r \neq p\}\big) \leq \Omega\big(\{(p,q,r) \in \mathcal{M}^3 \mid p \neq r\}\big) = (\mathrm{pr}_{13})_* \Omega\big(\{(p,r) \in \mathcal{M}^2 \mid p \neq r\}\big) = \Delta_* \mu\big(\mathcal{M}^2 \setminus \Delta(\mathcal{M})\big) = 1 - \mu(\mathcal{M}) = 0. \]
Therefore, in fact,
\[ \Omega\big(\{(p,q,p) \in \mathcal{M}^3 \mid p \preceq q \preceq p\}\big) = \underbrace{\Omega\big(\{(p,q,r) \in \mathcal{M}^3 \mid p \preceq q \preceq r\}\big)}_{=\,1} - \underbrace{\Omega\big(\{(p,q,r) \in \mathcal{M}^3 \mid p \preceq q \preceq r \neq p\}\big)}_{=\,0} = 1. \tag{45} \]
But $\mathcal{M}$ is causal (cf. Remark 6), therefore the causal precedence relation between events is antisymmetric and thus the set whose measure is evaluated in (45) is equal to $\{(p,p,p) \in \mathcal{M}^3 \mid p \in \mathcal{M}\}$.

We can now easily obtain that
\[ \omega(\Delta(\mathcal{M})) = \Omega(\Delta(\mathcal{M}) \times \mathcal{M}) = \Omega\big(\{(p,p,q) \in \mathcal{M}^3 \mid p, q \in \mathcal{M}\}\big) \geq \Omega\big(\{(p,p,p) \in \mathcal{M}^3 \mid p \in \mathcal{M}\}\big) = 1 \]
and so $\omega(\Delta(\mathcal{M})) = 1$. Invoking Lemma 3, we obtain that $\mu = \nu$.

Recall that the Lorentzian distance $d : \mathcal{M}^2 \to [0, +\infty]$ provides a physically meaningful way of measuring distances between events, in an analogy with the Riemannian distance $d_R$ in the case of Riemannian manifolds. In the latter case, one can extend the notion of a distance to the space of measures on the manifold. Concretely, for any $s \geq 1$ one defines the $s$-th Wasserstein distance between any two measures $\mu, \nu \in \mathscr{P}(\mathcal{R})$ on a Riemannian manifold $\mathcal{R}$ as
\[ W_s(\mu, \nu) := \inf_{\pi \in \Pi(\mu, \nu)} \left( \int_{\mathcal{R}^2} d_R(x, y)^s \, d\pi(x, y) \right)^{1/s}. \tag{46} \]
For an exposition of the theory of Wasserstein distances in the context of the optimal transport theory one is referred e.g. to [43].

We now propose the following natural definition of a distance between measures on a spacetime.

Definition 3. Let $\mathcal{M}$ be a spacetime and let $s \in (0, 1]$. The $s$-th Lorentz–Wasserstein distance is the map $LW_s : \mathscr{P}(\mathcal{M}) \times \mathscr{P}(\mathcal{M}) \to [0, +\infty]$ given by
\[ LW_s(\mu, \nu) := \begin{cases} \sup\limits_{\omega \in \Pi_c(\mu, \nu)} \left[ \int_{\mathcal{M}^2} d(p, q)^s \, d\omega(p, q) \right]^{1/s} & \text{if } \Pi_c(\mu, \nu) \neq \emptyset, \\ 0 & \text{if } \Pi_c(\mu, \nu) = \emptyset. \end{cases} \]

Notice that the integrals are well-defined, because $d$ is lower semi-continuous and hence Borel. Notice also that for Dirac measures $LW_s(\delta_p, \delta_q) = d(p, q)$ for any $s$.

Lorentz–Wasserstein distances have properties analogous to those of the Lorentzian distance (cf. Section 2.3).

Theorem 13.

Let $\mathcal{M}$ be a spacetime and let $s \in (0, 1]$. Then:

i) For any $\mu, \nu \in \mathscr{P}(\mathcal{M})$
\[ LW_s(\mu, \nu) > 0 \ \Leftrightarrow\ \exists\, \omega \in \Pi_c(\mu, \nu) \ \ \omega(I^+) > 0 \ \Rightarrow\ \mu \preceq \nu. \]

ii) The reverse triangle inequality holds. Namely, for any $\mu_1, \mu_2, \mu_3 \in \mathscr{P}(\mathcal{M})$
\[ \mu_1 \preceq \mu_2 \preceq \mu_3 \ \Rightarrow\ LW_s(\mu_1, \mu_2) + LW_s(\mu_2, \mu_3) \leq LW_s(\mu_1, \mu_3). \tag{47} \]

iii) For any $\mu \in \mathscr{P}(\mathcal{M})$, $LW_s(\mu, \mu)$ is either $0$ or $+\infty$.

iv) $\mathcal{M}$ is chronological iff $\forall\, \mu \in \mathscr{P}(\mathcal{M}) \ \ LW_s(\mu, \mu) = 0$.

v) For any $\mu, \nu \in \mathscr{P}(\mathcal{M})$, if $LW_s(\mu, \nu) \in (0, +\infty)$ then $LW_s(\nu, \mu) = 0$.

Proof: i) The implication is obvious, so we only prove the equivalence.

To prove the '$\Rightarrow$' part of the equivalence, assume that $LW_s(\mu, \nu) > 0$. By the very definition of $LW_s$, this implies that there exists $\omega \in \Pi_c(\mu, \nu)$ such that $\int_{\mathcal{M}^2} d(p,q)^s \, d\omega(p,q) > 0$. In order to prove that $\omega(I^+) > 0$, suppose on the contrary that $I^+$ is $\omega$-null. Then, one would have
\[ 0 < \int_{\mathcal{M}^2} d(p,q)^s \, d\omega(p,q) = \int_{J^+} d(p,q)^s \, d\omega(p,q) = \underbrace{\int_{E^+} d(p,q)^s \, d\omega(p,q)}_{=\,0,\ \text{because } d \text{ vanishes on } E^+} + \underbrace{\int_{I^+} d(p,q)^s \, d\omega(p,q)}_{=\,0,\ \text{because } \omega(I^+) = 0} = 0, \]
hence a contradiction.

To prove the '$\Leftarrow$' part, suppose there exists $\omega \in \Pi_c(\mu, \nu)$ with $\omega(I^+) > 0$, but nevertheless $LW_s(\mu, \nu) = 0$. The latter implies that $\int_{\mathcal{M}^2} d(p,q)^s \, d\omega(p,q) = 0$. But this, in turn, means that
\[ \int_{I^+} d(p,q)^s \, d\omega(p,q) = \int_{J^+} d(p,q)^s \, d\omega(p,q) - \underbrace{\int_{E^+} d(p,q)^s \, d\omega(p,q)}_{=\,0,\ \text{because } d \text{ vanishes on } E^+} = \int_{\mathcal{M}^2} d(p,q)^s \, d\omega(p,q) = 0. \]
But $d$ is positive on $I^+$ and so the latter must be an $\omega$-null set, which contradicts the assumption that $\omega(I^+) > 0$.

ii) Let $\mu_1, \mu_2, \mu_3 \in \mathscr{P}(\mathcal{M})$ satisfy $\mu_1 \preceq \mu_2 \preceq \mu_3$. Let $\omega_{12}$ and $\omega_{23}$ be any elements of $\Pi_c(\mu_1, \mu_2)$ and $\Pi_c(\mu_2, \mu_3)$, respectively, and let $\omega_{123} \in \mathscr{P}(\mathcal{M}^3)$ be a measure 'gluing them together' as specified in the Gluing Lemma. Recall from the discussion following that lemma that $\omega_{13} := (\mathrm{pr}_{13})_* \omega_{123} \in \Pi_c(\mu_1, \mu_3)$.

One has the inequality
\[ LW_s(\mu_1, \mu_3) \geq \left[ \int_{\mathcal{M}^2} d(p,q)^s \, d\omega_{12}(p,q) \right]^{1/s} + \left[ \int_{\mathcal{M}^2} d(q,r)^s \, d\omega_{23}(q,r) \right]^{1/s}, \tag{48} \]
which is proven through the following sequence of equalities and inequalities:
\begin{align*}
LW_s(\mu_1, \mu_3) &\geq \left[ \int_{\mathcal{M}^2} d(p,r)^s \, d\omega_{13}(p,r) \right]^{1/s} = \left[ \int_{\mathcal{M}^3} d(p,r)^s \, d\omega_{123}(p,q,r) \right]^{1/s} \\
&\geq \left[ \int_{\mathcal{M}^3} \big(d(p,q) + d(q,r)\big)^s \, d\omega_{123}(p,q,r) \right]^{1/s} \\
&\geq \left[ \int_{\mathcal{M}^3} d(p,q)^s \, d\omega_{123}(p,q,r) \right]^{1/s} + \left[ \int_{\mathcal{M}^3} d(q,r)^s \, d\omega_{123}(p,q,r) \right]^{1/s} \\
&= \left[ \int_{\mathcal{M}^2} d(p,q)^s \, d\omega_{12}(p,q) \right]^{1/s} + \left[ \int_{\mathcal{M}^2} d(q,r)^s \, d\omega_{23}(q,r) \right]^{1/s},
\end{align*}
where we have used, successively, the definition of $LW_s$, the Gluing Lemma (the definition of $\omega_{13}$), the reverse triangle inequality for $d$ (applicable because, by (42), $p \preceq q \preceq r$ holds $\omega_{123}$-almost everywhere), the reverse Minkowski inequality for integrals [20, Proposition 5.3.1] and, finally, the Gluing Lemma again (the definition of $\omega_{123}$).

By the arbitrariness of $\omega_{12} \in \Pi_c(\mu_1, \mu_2)$ and $\omega_{23} \in \Pi_c(\mu_2, \mu_3)$, inequality (48) immediately yields (47): one simply has to take the supremum over all $\omega_{12} \in \Pi_c(\mu_1, \mu_2)$ and all $\omega_{23} \in \Pi_c(\mu_2, \mu_3)$.

iii) By ii) and the fact that $\mu \preceq \mu$, one has $2\, LW_s(\mu, \mu) \leq LW_s(\mu, \mu)$, which is true iff either $LW_s(\mu, \mu) = 0$ or $LW_s(\mu, \mu) = +\infty$.

iv) To prove '$\Rightarrow$', assume that $\mathcal{M}$ is chronological. By i), it suffices to show that for any $\mu \in \mathscr{P}(\mathcal{M})$ and for any $\omega \in \Pi_c(\mu, \mu)$ we must have $\omega(I^+) = 0$.

Indeed, proceeding identically as in the beginning of the proof of Theorem 12, we obtain (compare with (44))
\[ \forall\, f \in L^1(\mathcal{M}, \mu) \quad \int_{J^+ \setminus \Delta(\mathcal{M})} \big(f(q) - f(p)\big) \, d\omega(p,q) = 0. \tag{49} \]
The key now is to use a past volume function $t^-$ associated to some admissible measure on $\mathcal{M}$. Recall that $t^-$ is causal. Moreover, since $\mathcal{M}$ is chronological, $t^-$ is increasing on any future-directed timelike curve (cf. Section 2.3). Symbolically:
\[ \forall\, (p,q) \in J^+ \quad t^-(p) \leq t^-(q) \qquad \text{and} \qquad \forall\, (p,q) \in I^+ \quad t^-(p) < t^-(q). \tag{50} \]
Substituting $f := t^-$ in (49) (recall that $t^-$ is Borel and bounded and hence $\mu$-integrable), we can write
\[ \int_{E^+ \setminus \Delta(\mathcal{M})} \big(t^-(q) - t^-(p)\big) \, d\omega(p,q) + \int_{I^+} \big(t^-(q) - t^-(p)\big) \, d\omega(p,q) = 0. \tag{51} \]
By the first property in (50), both integrals in (51) are nonnegative and hence they both must vanish. However, by the second property in (50), the integrand in the rightmost integral is positive on $I^+$, therefore this integral cannot vanish unless $\omega(I^+) = 0$.

The proof of '$\Leftarrow$' is straightforward. Take any $p \in \mathcal{M}$ and notice that, by assumption,
\[ d(p, p) = LW_s(\delta_p, \delta_p) = 0. \]
But this implies (see property i) of the Lorentzian distance in Section 2.3) that $p \not\ll p$ for any $p \in \mathcal{M}$, which means that $\mathcal{M}$ is chronological.

v) Suppose that $LW_s(\mu, \nu) \in (0, +\infty)$ but, nevertheless, $LW_s(\nu, \mu) > 0$. By i), this implies that $\mu \preceq \nu \preceq \mu$. By ii), we can write that
\[ 0 < LW_s(\mu, \nu) + LW_s(\nu, \mu) \leq LW_s(\mu, \mu). \]
On the other hand, again by ii), it is also true that
\[ LW_s(\mu, \mu) + LW_s(\mu, \nu) \leq LW_s(\mu, \nu), \]
which, since $LW_s(\mu, \nu)$ is assumed finite, implies that $LW_s(\mu, \mu) \leq 0$, contradicting the previous inequality.

Example 1.

Consider the (1 + 1)-dimensional Minkowski spacetime $\mathcal{M} := \mathbb{R}^2$, and fix $s \in (0, 1]$. Let $\mu := \delta_{(0,0)}$ and $\nu := \sum_{i=1}^{\infty} 2^{-i} \delta_{(2^{i/s}, 0)}$. Define $\omega \in \Pi_c(\mu, \nu)$ by
\[ \omega := \sum_{i=1}^{\infty} 2^{-i} \, \delta_{(0,0)} \times \delta_{(2^{i/s}, 0)}. \]
(In fact, it is the only causal coupling between those particular $\mu$ and $\nu$.) Therefore
\[ LW_s(\mu, \nu) \geq \left[ \int_{\mathcal{M}^2} d^s \, d\omega \right]^{1/s} = \left[ \sum_{i=1}^{\infty} d\big((0,0), (2^{i/s}, 0)\big)^s \, 2^{-i} \right]^{1/s} = \left[ \sum_{i=1}^{\infty} \big(2^{i/s}\big)^s \, 2^{-i} \right]^{1/s} = \left[ \sum_{i=1}^{\infty} 1 \right]^{1/s} = +\infty. \]

However, Lorentz–Wasserstein distances between two compactly supported measures in globally hyperbolic spacetimes are finite.

Proposition 6. Let $\mathcal{M}$ be a globally hyperbolic spacetime, $s \in (0, 1]$ and let $\mu, \nu \in \mathscr{P}(\mathcal{M})$ be compactly supported. Then, $LW_s(\mu, \nu) < +\infty$.

Proof: If $\Pi_c(\mu, \nu) = \emptyset$, then trivially $LW_s(\mu, \nu) = 0 < +\infty$. Assume then that the set of causal couplings between $\mu$ and $\nu$ is nonempty and take any $\omega \in \Pi_c(\mu, \nu)$. On the strength of Proposition 4, $\omega(\operatorname{supp} \mu \times \operatorname{supp} \nu) = 1$. By assumption, the set $\operatorname{supp} \mu \times \operatorname{supp} \nu \subseteq \mathcal{M}^2$ is compact. Moreover, by the global hyperbolicity of $\mathcal{M}$, $d$ is a continuous map and hence it is bounded on that compact set. Therefore,
\begin{align*}
\int_{\mathcal{M}^2} d(p,q)^s \, d\omega(p,q) &= \int_{\operatorname{supp} \mu \times \operatorname{supp} \nu} d(p,q)^s \, d\omega(p,q) + \underbrace{\int_{\mathcal{M}^2 \setminus (\operatorname{supp} \mu \times \operatorname{supp} \nu)} d(p,q)^s \, d\omega(p,q)}_{=\,0,\ \text{because the domain of integration is } \omega\text{-null}} \\
&\leq \max_{\substack{p \in \operatorname{supp} \mu \\ q \in \operatorname{supp} \nu}} d(p,q)^s \int_{\operatorname{supp} \mu \times \operatorname{supp} \nu} d\omega = \Big[ \max_{\substack{p \in \operatorname{supp} \mu \\ q \in \operatorname{supp} \nu}} d(p,q) \Big]^s
\end{align*}
and so, by the arbitrariness of $\omega$,
\[ LW_s(\mu, \nu) \leq \max_{\substack{p \in \operatorname{supp} \mu \\ q \in \operatorname{supp} \nu}} d(p,q) < +\infty. \]

Let us briefly summarise the main results of the paper. We proposed a notion of a causal relation between probability measures on a given spacetime $\mathcal{M}$. To give sense to Definition 2 embedded in the theory of optimal transport, we had to enter the domain on the verge of causality and measure theory. We believe that our paper paves the way to this terra incognita, which is worth exploring both from the viewpoint of mathematical relativity and of possible applications in quantum physics.

On the mathematical side, the presented theory can be developed in various directions.

Firstly, one can try to lower the causality conditions imposed on the spacetime in the theorems presented in Section 4. In particular, it would be interesting to see whether the defined relation on $\mathscr{P}(\mathcal{M})$ is a partial order for every causal spacetime $\mathcal{M}$, or whether assumption (43) in Theorem 12 is a necessary one. If the latter holds, one would obtain a new rung of the causal ladder between the causal and distinguishing spacetimes.

A second path of possible development is to investigate further the notion of a Lorentzian distance in the space of probability measures on a spacetime, and the associated topological questions.
In Section 5 we proposed a notion of the s th Lorentz–Wasserstein distance,which is a natural generalisation of the Lorentzian distance between the events on M .However, in the optimal transport theory there are other ways to measure distances be-tween probability measures (see for instance [44, p. 97]). It is tempting to see how (ifat all) these notions can be adapted to the spacetime framework. This directly relates tothe issue of topology on P ( M ) and its interplay with the semi-Riemannian metric on M .Another potential direction of future studies, particularly interesting from the viewpointof applications, would be to generalise the results of the present paper to signed measures.38his would allow to study causality of, both classical and quantum, charge (probability)densities on spacetimes.The applications of the developed theory in classical and quantum physics will bediscussed in details in a forthcoming paper. Let us, however, make some remarks here.Probability measures on space(time) arise in a natural way in quantum theory fromthe wave functions via the ‘modulus square principle’. The results of Hegerfeldt showthat in a generic quantum evolution driven by a Hamiltonian bounded from below a stateinitially localised in space immediately develops inﬁnite tails. If a quantum system isacausal in the sense of Hegerfeldt, then it is so in the sense of Deﬁnition 2. Indeed, a wavefunction is localised (that is of compact support) if and only if the corresponding probabilitymeasure is so. Thus, if µ ∈ P ( R n ) has compact support and µ t ∈ P ( R n ) extends to inﬁnityfor any t >

0, then δ × µ (cid:14) δ t × µ t as measures on the ( n + 1)-dimensional Minkowskispacetime on the strength of Proposition 5.Note however, that Proposition 5 provides only a necessary condition for a causalrelation to hold, and not a suﬃcient one. In [25], Hegerfeldt has extended his theorem toinitial states with exponentially bounded tails. He also suggested therein that a similarphenomenon resulting in the breakdown of causality should occur for states with powerlikedecay. It thus indicates that acausality is a property of the quantum system and cannotbe avoided by the use of nonlocal states. Our Deﬁnition 2 opens the door to check thisconjecture in a mathematically rigorous way.It is sometimes argued (see for instance [3, 4]) that Hegerfeldt’s theorem implies thatlocalised quantum states do not exist in Nature. This conclusion is however challenged bythe results in [29], which suggest that there is no lower limit on the localisation of the elec-tron. Moreover, the fact that a state is nonlocal does not necessarily cure the causalityviolation. Indeed, imagine that one disposes of an initial quantum state, localised or not,which undergoes an acausal evolution, i.e. δ × µ (cid:14) δ t × µ t for any t >

0. Then one couldencode information in the probability density of µ in some compact region K of space andtransmit it to an observer localised outside of J + ( { } × K ), as follows from the condition4 • . Such a method of signalling would have a very low eﬃciency, but is a priori possible –see for instance the discussion in [27] and other cited works by Hegerfeldt.Finally, let us come back to the original motivation of our preliminary Deﬁnition 1.As stressed at the beginning of Section 4, it was inspired by the notion of ‘causality inthe space of states’ coined in [17]. The partial order relation considered in [17] is deﬁnedon the space of states S ( A ) of a C ∗ -algebra A . If the algebra A is commutative then, byGelfand duality, there exists a locally compact Hausdorﬀ topological space M , such that A ≃ C ( M ). Then, the Riesz–Markov representation theorem implies that S ( A ) ≃ P ( M ).Hence, if M is a causally simple spacetime, then the two notions of ‘causality for Borelprobability measures’ and ‘causality in the space of states’ coincide.The concept of causality in the space of states was explored [18, 19] in the frameworkof ‘almost commutative spacetimes’, i.e. for C ∗ -algebras of the form C ( M ) ⊗ A F , with A F being a ﬁnite dimensional matrix algebra. However, the study therein was limited onlyto special subclasses of all states, nevertheless yielding interesting results. The theory putforward in the present paper blazes a trail to unravel the complete causal structure of almostcommutative spacetimes. Having in mind that almost commutative spacetimes are utilisedto build models in particle physics [42], it is enticing to see whether the extended causalstructure imposes any restrictions on probabilities that could be checked experimentally.39 eferences [1] A. Abdo et al. Testing Einstein’s special relativity with Fermi’s short hard γ -ray burstGRB090510. Nature , 462:331, 2009.[2] H. Aichmann and G. Nimtz. On the traversal time of barriers.

Foundations of Physics ,44(6):678–688, 2014.[3] M. Al-Hashimi and U.-J. Wiese. Minimal position–velocity uncertainty wave packets inrelativistic and non-relativistic quantum mechanics.

Annals of Physics , 324(12):2599– 2621, 2009.[4] N. Barat and J. Kimball. Localization and causality for a free particle.

Physics LettersA , 308(2–3):110–115, 2003.[5] D. Beckman, D. Gottesman, M. A. Nielsen, and J. Preskill. Causal and localizablequantum operations.

Physical Review A , 64:052309, 2001.[6] J. Beem, P. Ehrlich, and K. Easley.

Global Lorentzian Geometry , volume 202 of

Monographs and Textbooks in Pure and Applied Mathematics . CRC Press, 1996.[7] A. Bernal and M. S´anchez. Smoothness of time functions and the metric splitting ofglobally hyperbolic spacetimes.

Communications in Mathematical Physics , 257(1):43–50, 2005.[8] M. V. Berry. Causal wave propagation for relativistic massive particles: physicalasymptotics in action.

European Journal of Physics , 33(2):279, 2012.[9] F. Besnard. A noncommutative view on topology and order.

Journal of Geometryand Physics , 59(7):861–875, 2009.[10] D. Buchholz and K. Fredenhagen. Locality and the structure of particle states.

Com-munications in Mathematical Physics , 84(1):1–54, 1982.[11] F. Buscemi and G. Compagno. Non-locality and causal evolution in QFT.

Journal ofPhysics B: Atomic, Molecular and Optical Physics , 39(15):695–709, 2006.[12] P. T. Chru´sciel, J. D. E. Grant, and E. Minguzzi. On diﬀerentiability of volume timefunctions. arXiv preprint gr-qc/1301.2909 , 2013.[13] S. Doplicher, K. Fredenhagen, and J. E. Roberts. Spacetime quantization induced byclassical gravity.

Physics Letters B , 331(1–2):39 – 44, 1994.[14] S. Doplicher, K. Fredenhagen, and J. E. Roberts. The quantum structure of spacetimeat the planck scale and quantum ﬁelds.

Communications in Mathematical Physics ,172(1):187–220, 1995.[15] P. Eberhard and R. Ross. Quantum ﬁeld theory cannot provide faster-than-lightcommunication.

Foundations of Physics Letters , 2(2):127–149, 1989.[16] L. L. Foldy and S. A. Wouthuysen. On the Dirac theory of spin 1/2 particles and itsnon-relativistic limit.

Physical Review , 78(1):29, 1950.4017] N. Franco and M. Eckstein. An algebraic formulation of causality for noncommutativegeometry.

Classical and Quantum Gravity , 30(13):135007, 2013.[18] N. Franco and M. Eckstein. Exploring the causal structures of almost commutativegeometries.

Symmetry, Integrability and Geometry: Methods and Applications , 10:010,2014. Special Issue on Noncommutative Geometry and Quantum Groups in honor ofMarc A. Rieﬀel.[19] N. Franco and M. Eckstein. Causality in noncommutative two-sheeted space-times.

Journal of Geometry and Physics , 96:42 – 58, 2015.[20] D. J. H. Garling.

Inequalities: A Journey into Linear Analysis . Cambridge UniversityPress, 2007. Cambridge Books Online.[21] M. Gell-Mann, M. L. Goldberger, and W. E. Thirring. Use of causality conditions inquantum theory.

Physical Review , 95:1612–1627, 1954.[22] R. Haag.

Local Quantum Physics: Fields, Particles, Algebras . Theoretical and Math-ematical Physics. Springer Berlin Heidelberg, 1996.[23] S. W. Hawking. Chronology protection conjecture.

Physical Review D , 46:603–611,1992.[24] G. C. Hegerfeldt. Remark on causality and particle localization.

Physical Review D, 10:3320–3321, 1974.

[25] G. C. Hegerfeldt. Violation of causality in relativistic quantum theory? Physical Review Letters, 54:2395–2398, 1985.

[26] G. C. Hegerfeldt. Causality, particle localization and positivity of the energy. In A. Bohm, H.-D. Doebner, and P. Kielanowski, editors, Irreversibility and causality semigroups and rigged Hilbert spaces, volume 504 of Lecture Notes in Physics, pages 238–245. Springer Berlin Heidelberg, 1998.

[27] G. C. Hegerfeldt. Particle localization and the notion of Einstein causality. In A. Horzela and E. Kapuścik, editors, Extensions of Quantum Theory, pages 9–16. Apeiron, Montreal, 2001.

[28] G. C. Hegerfeldt and S. N. M. Ruijsenaars. Remarks on causality, localization, and spreading of wave packets. Physical Review D, 22:377–384, 1980.

[29] P. Krekora, Q. Su, and R. Grobe. Relativistic electron localization and the lack of Zitterbewegung. Physical Review Letters, 93(4):043004, 2004.

[30] E. Minguzzi. Time functions as utilities. Communications in Mathematical Physics, 298(3):855–868, 2010.

[31] E. Minguzzi and M. Sánchez. The causal hierarchy of spacetimes. In D. V. Alekseevsky and H. Baum, editors, Recent developments in pseudo-Riemannian geometry, ESI Lectures in Mathematics and Physics, pages 299–358. European Mathematical Society Publishing House, 2008.

[32] V. Moretti. Aspects of noncommutative Lorentzian geometry for globally hyperbolic spacetimes. Reviews in Mathematical Physics, 15(10):1171–1217, 2003.

[33] B. O’Neill.
Semi-Riemannian Geometry with Applications to Relativity. Academic Press, 1983.

[34] M. Pawłowski, T. Paterek, D. Kaszlikowski, V. Scarani, A. Winter, and M. Żukowski. Information causality as a physical principle. Nature, 461(7267):1101–1104, 2009.

[35] R. Penrose. Techniques of Differential Topology in Relativity, volume 7 of CBMS–NSF Regional Conference Series in Applied Mathematics. SIAM, 1972.

[36] A. Peres and D. R. Terno. Quantum information and relativity theory. Reviews of Modern Physics, 76:93–123, Jan 2004.

[37] H. Ringström. The Cauchy Problem in General Relativity. ESI Lectures in Mathematics and Physics. European Mathematical Society, 2009.

[38] W. Rudin. Real and Complex Analysis. McGraw-Hill Book Co., New York, third edition, 1987.

[39] S. M. Srivastava. A Course on Borel Sets, volume 180 of Graduate Texts in Mathematics. Springer, 2008.

[40] R. F. Streater and A. S. Wightman.
PCT, Spin and Statistics, and All That. Princeton Landmarks in Mathematics and Physics. Princeton University Press, 2000.

[41] B. Thaller. The Dirac Equation, volume 31 of Theoretical and Mathematical Physics. Springer-Verlag Berlin, 1992.

[42] W. D. van Suijlekom. Noncommutative Geometry and Particle Physics. Mathematical Physics Studies. Springer, 2015.

[43] C. Villani. Topics in Optimal Transportation. Graduate Studies in Mathematics. American Mathematical Society, Providence (R.I.), 2003.

[44] C. Villani. Optimal Transport: Old and New, volume 338 of Grundlehren der mathematischen Wissenschaften. Springer-Verlag Berlin Heidelberg, 2008.

[45] R. Wagner, B. Shields, M. Ware, Q. Su, and R. Grobe. Causality and relativistic localization in one-dimensional Hamiltonians. Physical Review A, 83:062106, 2011.

[46] S. Willard. General Topology. Addison-Wesley, Reading, MA, 1970.

[47] H. G. Winful. Tunneling time, the Hartman effect, and superluminality: A proposed resolution of an old paradox.