On the Continuity of the Feasible Set Mapping in Optimal Transport

Abstract

Consider the set of probability measures with given marginal distributions on the product of two complete, separable metric spaces, seen as a correspondence when the marginal distributions vary. In problems of optimal transport, continuity of this correspondence from marginal to joint distributions is often desired, in light of Berge's Maximum Theorem, to establish continuity of the value function in the marginal distributions, as well as stability of the set of optimal transport plans. Bergin (1999) established the continuity of this correspondence, and in this note, we present a novel and considerably shorter proof of this important result. We then examine an application to an assignment game (transferable utility matching problem) with unknown type distributions.


arXiv preprint [q-fin.RM]

Mario Ghossoub ∗ David Saunders † September 29, 2020


Keywords:

Optimal transport; Measures on product spaces with fixed marginals; Continuity of correspondences on spaces of measures; Matching with transferable utility; Assignment game; Hedonic pricing.

JEL Codes:

C60, C61.

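∗ Department of Statistics and Actuarial Science, University of Waterloo.
† Corresponding author. Department of Statistics and Actuarial Science, University of Waterloo.

1 Introduction

Optimization problems over sets of probability measures with given marginals, and optimal transport problems in particular, arise in several contexts in economics (see, e.g., Galichon [2016] for a book-length treatment, and the two special issues in volumes 42(2) and 67(2) of Economic Theory). Such ubiquitous problems can be formulated as

$$\sup_{\pi \in \Pi_{\mathcal{X},\mathcal{Y}}(\mu,\nu)} \int_{\mathcal{X} \times \mathcal{Y}} \Phi(x,y)\, d\pi(x,y), \tag{1.1}$$

where $\Pi_{\mathcal{X},\mathcal{Y}}(\mu,\nu)$ denotes the set of all probability measures on a product space $\mathcal{X} \times \mathcal{Y}$ with given marginal distributions $\mu$ on $\mathcal{X}$ and $\nu$ on $\mathcal{Y}$ (called the set of couplings of $\mu$ and $\nu$), and $\Phi : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$ is a given function.

$\mathcal{X}$ and $\mathcal{Y}$ are two Polish (i.e., complete, separable, metric) spaces, with respective Borel $\sigma$-algebras $\mathcal{B}_{\mathcal{X}}$ and $\mathcal{B}_{\mathcal{Y}}$. For a Polish space $S$, $\mathcal{P}(S)$ is the set of all Borel probability measures on $S$. Given $\mu \in \mathcal{P}(\mathcal{X})$ and $\nu \in \mathcal{P}(\mathcal{Y})$, it follows that

$$\Pi_{\mathcal{X},\mathcal{Y}}(\mu,\nu) = \Big\{ \pi \in \mathcal{P}(\mathcal{X} \times \mathcal{Y}) : \pi(A \times \mathcal{Y}) = \mu(A),\ \pi(\mathcal{X} \times B) = \nu(B),\ \forall (A,B) \in \mathcal{B}_{\mathcal{X}} \times \mathcal{B}_{\mathcal{Y}} \Big\}.$$

For sequences $\{\pi_n\}_n \subset \mathcal{P}(S)$, $\pi_n \to \pi$ denotes convergence in the narrow topology on $\mathcal{P}(S)$ (i.e., $\int f\, d\pi_n \to \int f\, d\pi$ for all $f \in C_b(S)$, the space of bounded continuous functions from $S$ to $\mathbb{R}$), which we note is metrizable by the Prokhorov metric (e.g., Billingsley [1999, Theorem 6.8]) on $\mathcal{P}(S)$ defined by

$$d_P(P,Q) := \inf \Big\{ \varepsilon > 0 : P(A) \le Q(A^{\varepsilon}) + \varepsilon,\ \forall A \in \mathcal{B}_S \Big\}, \tag{1.2}$$

where $\mathcal{B}_S$ denotes the Borel $\sigma$-algebra on $S$, and for each $A \in \mathcal{B}_S$,

$$A^{\varepsilon} := \Big\{ y \in S : d_S(x,y) < \varepsilon \text{ for some } x \in A \Big\}.$$

Then for each $(\mu,\nu) \in \mathcal{P}(\mathcal{X}) \times \mathcal{P}(\mathcal{Y})$, $\Pi_{\mathcal{X},\mathcal{Y}}(\mu,\nu)$ is nonempty, convex, and compact in the narrow topology on $\mathcal{P}(\mathcal{X} \times \mathcal{Y})$ (e.g., Villani [2003, pp. 32, 49-50]).

Problem (1.1) is precisely the Monge-Kantorovich optimal transport problem. Here, we are interested in the properties of the correspondence $\Pi_{\mathcal{X},\mathcal{Y}} : \mathcal{P}(\mathcal{X}) \times \mathcal{P}(\mathcal{Y}) \twoheadrightarrow \mathcal{P}(\mathcal{X} \times \mathcal{Y})$. Formally, $\Pi_{\mathcal{X},\mathcal{Y}}$ associates to each pair $(\mu,\nu) \in \mathcal{P}(\mathcal{X}) \times \mathcal{P}(\mathcal{Y})$ of marginal distributions the feasibility set $\Pi_{\mathcal{X},\mathcal{Y}}(\mu,\nu)$ of Problem (1.1). Let $\mathrm{Gr}(\Pi_{\mathcal{X},\mathcal{Y}})$ denote the graph of the correspondence $\Pi_{\mathcal{X},\mathcal{Y}}$, given by

$$\mathrm{Gr}(\Pi_{\mathcal{X},\mathcal{Y}}) := \Big\{ ((\mu,\nu),\pi) \in (\mathcal{P}(\mathcal{X}) \times \mathcal{P}(\mathcal{Y})) \times \mathcal{P}(\mathcal{X} \times \mathcal{Y}) : \pi \in \Pi_{\mathcal{X},\mathcal{Y}}(\mu,\nu) \Big\}.$$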
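On finite spaces, a coupling is simply a nonnegative matrix with prescribed row and column sums, and the objective in Problem (1.1) is a weighted sum of its entries. The following Python sketch illustrates this special case only; it is not part of the argument of this note. All function names, the type values, and the surplus $\Phi(x,y) = xy$ are hypothetical choices made for illustration. The brute-force maximization assumes that both marginals are uniform on the same number of points, in which case the extreme points of the coupling set are scaled permutation matrices (Birkhoff-von Neumann), so the supremum of a linear objective is attained at one of them.

```python
from itertools import permutations


def independent_coupling(mu, nu):
    """Product measure of mu and nu as a matrix; its marginals are mu and nu."""
    return [[p * q for q in nu] for p in mu]


def marginals(pi):
    """Row sums (first marginal) and column sums (second marginal) of pi."""
    return [sum(row) for row in pi], [sum(col) for col in zip(*pi)]


def surplus(phi, pi):
    """Finite-space objective of (1.1): sum over i, j of phi[i][j] * pi[i][j]."""
    return sum(
        f * p
        for phi_row, pi_row in zip(phi, pi)
        for f, p in zip(phi_row, pi_row)
    )


def max_surplus_uniform(phi):
    """Maximal surplus and an optimal permutation for uniform marginals.

    Assumption: both marginals are uniform on n points, so it suffices to
    search over permutation matrices scaled by 1/n. Enumerating all n!
    permutations is only sensible for small n.
    """
    n = len(phi)
    return max(
        (sum(phi[i][s[i]] for i in range(n)) / n, s)
        for s in permutations(range(n))
    )


# Hypothetical example: types 0, 1, 2 on both sides, surplus phi(x, y) = x * y.
types = [0.0, 1.0, 2.0]
phi = [[x * y for y in types] for x in types]
uniform = [1.0 / 3.0] * 3

product = independent_coupling(uniform, uniform)
rows, cols = marginals(product)
value, matching = max_surplus_uniform(phi)

print("marginals of the product coupling:", rows, cols)
print("surplus under the product coupling:", surplus(phi, product))
print("maximal surplus:", value, "attained by permutation", matching)
```

The product coupling is always feasible, which is the finite counterpart of the nonemptiness of $\Pi_{\mathcal{X},\mathcal{Y}}(\mu,\nu)$; its surplus gives a lower bound on the value of the problem.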
We define the linear functional $\Psi : \mathrm{Gr}(\Pi_{\mathcal{X},\mathcal{Y}}) \to \mathbb{R}$ by

$$\Psi((\mu,\nu),\pi) := \int_{\mathcal{X} \times \mathcal{Y}} \Phi(x,y)\, d\pi(x,y). \tag{1.3}$$

Furthermore, we define the value function $V : \mathcal{P}(\mathcal{X}) \times \mathcal{P}(\mathcal{Y}) \to \mathbb{R}$ for Problem (1.1) by

$$V(\mu,\nu) := \sup_{\pi \in \Pi_{\mathcal{X},\mathcal{Y}}(\mu,\nu)} \Psi((\mu,\nu),\pi). \tag{1.4}$$

Finally, we define the correspondence $M : \mathcal{P}(\mathcal{X}) \times \mathcal{P}(\mathcal{Y}) \twoheadrightarrow \mathcal{P}(\mathcal{X} \times \mathcal{Y})$, which assigns to each given pair of marginal distributions $(\mu,\nu)$ the set of optimizers of Problem (1.1), by

$$M(\mu,\nu) := \Big\{ \pi^* \in \Pi_{\mathcal{X},\mathcal{Y}}(\mu,\nu) : \Psi((\mu,\nu),\pi^*) = V(\mu,\nu) \Big\}. \tag{1.5}$$

Note that, by the Monge-Kantorovich Duality Theorem (e.g., Villani [2003, Theorem 1.3]), nonemptiness of $M(\mu,\nu)$ follows from the upper-semicontinuity of the function $\Phi$, as long as there are lower-semicontinuous functions $a \in L^1(\mathcal{X}, \mathcal{B}_{\mathcal{X}}, \mu)$ and $b \in L^1(\mathcal{Y}, \mathcal{B}_{\mathcal{Y}}, \nu)$ such that $\Phi(x,y) \le a(x) + b(y)$, for $\mu$-a.e. $x$ and $\nu$-a.e. $y$.

One is typically interested in properties of the correspondences $\Pi_{\mathcal{X},\mathcal{Y}}$ and $M$, as well as continuity of the value function $V$, which is important when approximating Problem (1.1) in practice. Moreover, while it is immediate to see that $\Pi_{\mathcal{X},\mathcal{Y}}$ has nonempty, convex, and compact values in the narrow topology on $\mathcal{P}(\mathcal{X} \times \mathcal{Y})$, the continuity of $\Pi_{\mathcal{X},\mathcal{Y}}$ is of primary concern, in light of Berge's Maximum Theorem. Indeed, since $\Pi_{\mathcal{X},\mathcal{Y}}$ has nonempty compact values, and since $\mathcal{P}(\mathcal{X} \times \mathcal{Y})$ is Hausdorff, being metrizable, continuity of the value function of Problem (1.1) and upper hemicontinuity of the correspondence $M$ would follow from continuity of the correspondence $\Pi_{\mathcal{X},\mathcal{Y}}$, under mild regularity conditions on the function $\Phi$.

Bergin [1999] and Savchenko and Zarichnyi [2014] provided proofs of the continuity of the feasible set correspondence $\Pi_{\mathcal{X},\mathcal{Y}}$ based on rather lengthy arguments. In this paper, we present in Section 2 an alternative, much shorter proof of this important result, using well-known measure theoretic tools. We then examine in Section 3 an application to a canonical matching problem with transferable utility.

2 Continuity of $\Pi_{\mathcal{X},\mathcal{Y}}$

We will make use of the following two results. The first can be found in Ethier and Kurtz [2005, Theorem 3.1.2], and it provides a useful alternative characterization of the Prokhorov metric, which metrizes narrow convergence, in terms of couplings. The second can be found in Villani [2003, pp. 208-210], and it is often referred to as the Gluing Lemma.
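To see what the coupling characterization of the Prokhorov metric (stated as Lemma 2.1) says in the simplest case, consider two Dirac masses. This worked example is added for illustration and is not part of the original argument; the points $a, b \in S$ are hypothetical. The only coupling of $\delta_a$ and $\delta_b$ is $m = \delta_{(a,b)}$, and

$$m\big[(x,y) : d_S(x,y) \ge \varepsilon\big] = \begin{cases} 1, & \varepsilon \le d_S(a,b), \\ 0, & \varepsilon > d_S(a,b). \end{cases}$$

Hence the condition $m[(x,y) : d_S(x,y) \ge \varepsilon] \le \varepsilon$ holds exactly when $\varepsilon > d_S(a,b)$ or $\varepsilon \ge 1$, so that

$$d_P(\delta_a, \delta_b) = \min\{ d_S(a,b),\, 1 \}.$$

The same value is obtained directly from definition (1.2) by taking $A = \{a\}$. In particular, $\delta_{a_n} \to \delta_a$ narrowly if and only if $a_n \to a$ in $S$.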

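Lemma 2.1.

Let $(S, d_S)$ be a Polish space with Borel $\sigma$-algebra $\mathcal{B}_S$ and let $d_P$ denote the Prokhorov metric on $\mathcal{P}(S)$, defined in eq. (1.2). Then, for all $P, Q \in \mathcal{P}(S)$,

$$d_P(P,Q) = \inf_{m \in \Pi_{S,S}(P,Q)} \inf \Big\{ \varepsilon > 0 : m\big[(x,y) : d_S(x,y) \ge \varepsilon\big] \le \varepsilon \Big\}. \tag{2.1}$$

Lemma 2.2 (Gluing Lemma).

Let $v_1, v_2, v_3$ be three probability measures supported in Polish spaces $S_1, S_2, S_3$ respectively, and let $m_{12} \in \Pi_{S_1,S_2}(v_1,v_2)$ and $m_{23} \in \Pi_{S_2,S_3}(v_2,v_3)$ be two transference plans. Then there exists a probability measure $m \in \mathcal{P}(S_1 \times S_2 \times S_3)$ with marginals $m_{12}$ on $S_1 \times S_2$ and $m_{23}$ on $S_2 \times S_3$. That is, if $\mathcal{B}_{S_1 \times S_2}$ and $\mathcal{B}_{S_2 \times S_3}$ denote the Borel $\sigma$-algebras of $S_1 \times S_2$ and $S_2 \times S_3$, respectively, then

$$m(A \times S_3) = m_{12}(A), \quad m(S_1 \times B) = m_{23}(B), \quad \forall (A,B) \in \mathcal{B}_{S_1 \times S_2} \times \mathcal{B}_{S_2 \times S_3}.$$

Theorem 2.3.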

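The correspondence $\Pi_{\mathcal{X},\mathcal{Y}} : \mathcal{P}(\mathcal{X}) \times \mathcal{P}(\mathcal{Y}) \twoheadrightarrow \mathcal{P}(\mathcal{X} \times \mathcal{Y})$ is continuous, and has nonempty, convex, and compact values in the narrow topology on $\mathcal{P}(\mathcal{X} \times \mathcal{Y})$.

Proof. First, note that for every $(\mu,\nu) \in \mathcal{P}(\mathcal{X}) \times \mathcal{P}(\mathcal{Y})$, $\Pi_{\mathcal{X},\mathcal{Y}}(\mu,\nu) \neq \emptyset$, since the product measure $\mu \otimes \nu$ belongs to $\Pi_{\mathcal{X},\mathcal{Y}}(\mu,\nu)$. Moreover, $\Pi_{\mathcal{X},\mathcal{Y}}$ trivially has convex values. Compactness of the values of $\Pi_{\mathcal{X},\mathcal{Y}}$ in the narrow topology on $\mathcal{P}(\mathcal{X} \times \mathcal{Y})$ is shown in Villani [2003, pp. 49-50], for instance. We now show continuity of $\Pi_{\mathcal{X},\mathcal{Y}}$.

To show upper hemicontinuity, suppose that we have $\{((\mu_n,\nu_n),\pi_n)\}_n \subset \mathrm{Gr}(\Pi_{\mathcal{X},\mathcal{Y}})$, with $\mu_n \to \mu$ and $\nu_n \to \nu$. Hence, $\{\mu_n\}_n$ and $\{\nu_n\}_n$ are tight, by Prokhorov's Theorem (e.g., Billingsley [1999, Theorems 5.1-5.2]). Tightness of $\{\mu_n\}_n$ and $\{\nu_n\}_n$ implies that of $\{\pi_n\}_n$, so that by Prokhorov's Theorem there exists a convergent subsequence $\pi_{n_k} \to \pi$. For any $(f,g) \in C_b(\mathcal{X}) \times C_b(\mathcal{Y})$, we have

$$\int f\, d\pi = \lim_{k \to \infty} \int f\, d\pi_{n_k} = \lim_{k \to \infty} \int f\, d\mu_{n_k} = \int f\, d\mu; \qquad \int g\, d\pi = \lim_{k \to \infty} \int g\, d\pi_{n_k} = \lim_{k \to \infty} \int g\, d\nu_{n_k} = \int g\, d\nu.$$

Therefore,

$$\int_{\mathcal{X} \times \mathcal{Y}} [f + g]\, d\pi = \int_{\mathcal{X}} f\, d\mu + \int_{\mathcal{Y}} g\, d\nu,$$

and since $\mathcal{X}, \mathcal{Y}$ are Polish spaces, it follows from Villani [2003, p. 18] that $\pi \in \Pi_{\mathcal{X},\mathcal{Y}}(\mu,\nu)$. Since the spaces involved are metrizable, this sequential property (every sequence in the graph whose marginals converge to $(\mu,\nu)$ has a limit point in $\Pi_{\mathcal{X},\mathcal{Y}}(\mu,\nu)$) yields upper hemicontinuity.

To show lower hemicontinuity, fix $\pi \in \Pi_{\mathcal{X},\mathcal{Y}}(\mu,\nu)$ and suppose that we have $\{(\mu_n,\nu_n)\}_n \subset \mathcal{P}(\mathcal{X}) \times \mathcal{P}(\mathcal{Y})$, with $\mu_n \to \mu$ and $\nu_n \to \nu$. Since $\mu_n \to \mu$, it follows that $d_P(\mu_n,\mu) \to 0$, where $d_P$ denotes the Prokhorov metric, characterized in Lemma 2.1. Fix $n$, let $0 < \varepsilon_n \le d_P(\mu_n,\mu) + \frac{1}{n}$, and let $v_{1,n} \in \Pi_{\mathcal{X},\mathcal{X}}(\mu_n,\mu)$ be such that

$$v_{1,n}\big[(x,x') : d_{\mathcal{X}}(x,x') \ge \varepsilon_n\big] \le \varepsilon_n.$$

Applying Lemma 2.2 with $S_i = \mathcal{X}$ for $i = 1,2$, $S_3 = \mathcal{Y}$, $m_{12} = v_{1,n}$, and $m_{23} = \pi$, we obtain a measure $m_{1,n}$ on $\mathcal{X} \times \mathcal{X} \times \mathcal{Y}$ with the required "bivariate" marginal distributions.

Similarly, let $0 < \delta_n \le d_P(\nu,\nu_n) + \frac{1}{n}$ and $v_{2,n} \in \Pi_{\mathcal{Y},\mathcal{Y}}(\nu,\nu_n)$ be such that

$$v_{2,n}\big[(y,y') : d_{\mathcal{Y}}(y,y') \ge \delta_n\big] \le \delta_n.$$

Apply Lemma 2.2 again with $S_1 = \mathcal{X} \times \mathcal{X}$, $S_2 = S_3 = \mathcal{Y}$, $m_{12} = m_{1,n}$, and $m_{23} = v_{2,n}$ to obtain a measure $m_n$ on $\mathcal{X} \times \mathcal{X} \times \mathcal{Y} \times \mathcal{Y}$ with "univariate" marginal distributions $\mu_n, \mu, \nu, \nu_n$ and "bivariate" marginal distributions $v_{1,n}$ for the first and second components, $\pi$ for the second and third components, and $v_{2,n}$ for the third and fourth components.

Let $\pi_n$ denote the "bivariate" marginal distribution (on $\mathcal{X} \times \mathcal{Y}$) of the first and fourth components (so that $\pi_n \in \Pi_{\mathcal{X},\mathcal{Y}}(\mu_n,\nu_n)$), and consider the measure $\tilde{m}_n$ on $(\mathcal{X} \times \mathcal{Y}) \times (\mathcal{X} \times \mathcal{Y})$ with marginals $(\pi_n,\pi)$ that is the image of $m_n$ under the mapping $\sigma(x_1,x_2,y_1,y_2) = (x_1,y_2,x_2,y_1)$. Metrize the product space using

$$d_{\mathcal{X} \times \mathcal{Y}}((x,y),(x',y')) := \max\big(d_{\mathcal{X}}(x,x'),\, d_{\mathcal{Y}}(y,y')\big),$$

so that

$$\Big\{ d_{\mathcal{X} \times \mathcal{Y}}((x,y),(x',y')) \ge c \Big\} \implies \Big\{ d_{\mathcal{X}}(x,x') \ge c \ \text{or}\ d_{\mathcal{Y}}(y,y') \ge c \Big\}.$$

Then

$$\begin{aligned}
\tilde{m}_n\big[d_{\mathcal{X} \times \mathcal{Y}}((x,y),(x',y')) \ge \varepsilon_n + \delta_n\big]
&\le \tilde{m}_n\big[d_{\mathcal{X}}(x,x') \ge \varepsilon_n + \delta_n\big] + \tilde{m}_n\big[d_{\mathcal{Y}}(y,y') \ge \varepsilon_n + \delta_n\big] \\
&\le \tilde{m}_n\big[d_{\mathcal{X}}(x,x') \ge \varepsilon_n\big] + \tilde{m}_n\big[d_{\mathcal{Y}}(y,y') \ge \delta_n\big] \\
&\le \varepsilon_n + \delta_n.
\end{aligned}$$

Thus, by Lemma 2.1 applied on $\mathcal{X} \times \mathcal{Y}$, $d_P(\pi_n,\pi) \le \varepsilon_n + \delta_n$. Since $\varepsilon_n + \delta_n \le d_P(\mu_n,\mu) + d_P(\nu,\nu_n) + \frac{2}{n} \to 0$, we have $\pi_n \in \Pi_{\mathcal{X},\mathcal{Y}}(\mu_n,\nu_n)$ and $\pi_n \to \pi$, which establishes lower hemicontinuity.

3 Application to a Matching Problem with Transferable Utility

We consider a canonical example of a matching problem with transferable utility, or assignment game (Shapley and Shubik [1971]), in which a central planner seeks to assign an element $x$ from a population $\mathcal{X}$ to an element $y$ from a population $\mathcal{Y}$. Both $\mathcal{X}$ and $\mathcal{Y}$ can be multidimensional, and we take them to be generic nonempty Polish spaces.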
The spaces $\mathcal{X}$ and $\mathcal{Y}$ are equipped with Borel probability measures $\mu$ and $\nu$, respectively, representing the distribution of agents' types over the respective spaces. Let $\Phi : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$ denote the joint utility (or surplus) function, whereby $\Phi(x,y)$ is the joint surplus generated if $x \in \mathcal{X}$ is matched with $y \in \mathcal{Y}$. For instance, $\mu$ can denote the distribution of skills over a set $\mathcal{X}$ for a population of workers, $\nu$ the distribution of firm characteristics over a set $\mathcal{Y}$, and $\Phi(x,y)$ the value created if a worker with skill $x \in \mathcal{X}$ is employed by a firm with characteristic $y \in \mathcal{Y}$.

Following Chiappori et al. [2010], an assignment (or matching) of $x \in \mathcal{X}$ to $y \in \mathcal{Y}$ is a probability measure $\pi \in \Pi_{\mathcal{X},\mathcal{Y}}(\mu,\nu)$ with support $\mathrm{supp}(\pi) \subset \mathcal{X} \times \mathcal{Y}$, which leads to an economic value, or total surplus, of

$$\int_{\mathcal{X} \times \mathcal{Y}} \Phi(x,y)\, d\pi(x,y).$$

A payoff corresponding to an assignment $\pi \in \Pi_{\mathcal{X},\mathcal{Y}}(\mu,\nu)$ is a pair of functions $(U_{\mathcal{X}}, U_{\mathcal{Y}}) \in L^1(\mathcal{X}, \mathcal{B}_{\mathcal{X}}, \mu) \times L^1(\mathcal{Y}, \mathcal{B}_{\mathcal{Y}}, \nu)$ such that

$$U_{\mathcal{X}}(x) + U_{\mathcal{Y}}(y) = \Phi(x,y), \quad \text{for } \pi\text{-a.e. } (x,y) \in \mathrm{supp}(\pi).$$

An outcome is a triple $(\pi, U_{\mathcal{X}}, U_{\mathcal{Y}})$, where $(U_{\mathcal{X}}, U_{\mathcal{Y}})$ is a payoff corresponding to $\pi$. The standard equilibrium concept used in this framework is stability. An outcome $(\pi, U_{\mathcal{X}}, U_{\mathcal{Y}})$ is called stable if it satisfies

$$U_{\mathcal{X}}(x) + U_{\mathcal{Y}}(y) \ge \Phi(x,y), \quad \forall (x,y) \in \mathcal{X} \times \mathcal{Y}.$$

Finally, a matching $\pi$ is stable if there exists a payoff $(U_{\mathcal{X}}, U_{\mathcal{Y}})$ corresponding to $\pi$, such that the outcome $(\pi, U_{\mathcal{X}}, U_{\mathcal{Y}})$ is stable. Hence, stability is tantamount to robustness against deviations by both individuals and pairs. In other words, stability requires that (i) no matched agent is better off unmatched; and (ii) no two unmatched agents are better off matched together than remaining in their current situation.

A fundamental result in the theory of matching with transferable utility is that stability is equivalent to surplus maximization. This result is due to Shapley and Shubik [1971] in the discrete case and Gretsky et al. [1992] in the continuous case (and it is also a consequence of the Monge-Kantorovich duality; see Villani [2008, Theorem 5.10]). It was recently extended by Pass [2019] to a setting of tripartite matching (also known as multi-marginal optimal transport).

Proposition 3.1.

For a given surplus function $\Phi : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$, a matching $\pi \in \Pi_{\mathcal{X},\mathcal{Y}}(\mu,\nu)$ is stable if and only if it solves the surplus maximization problem (1.1):

$$\sup_{\pi \in \Pi_{\mathcal{X},\mathcal{Y}}(\mu,\nu)} \int_{\mathcal{X} \times \mathcal{Y}} \Phi(x,y)\, d\pi(x,y).$$

A central planner can hence implement a stable, that is, equilibrium assignment by solving the surplus maximization problem (1.1). This, however, necessitates knowledge of the marginal (type) distributions $\mu$ and $\nu$. If the type distributions $\mu$ and $\nu$ are unknown to the central planner, then since $\mathcal{X}$ and $\mathcal{Y}$ are separable, an approximation based on the empirical distributions of samples can be used, as long as the value of Problem (1.1) is continuous. This, in turn, can be obtained from Berge's Maximum Theorem when the correspondence $\Pi_{\mathcal{X},\mathcal{Y}}$ is continuous, under some regularity conditions on the surplus function $\Phi$. We summarize this in Proposition 3.2 below. First, however, we introduce some needed notation.

For the probability space $(\mathcal{X}, \mathcal{B}_{\mathcal{X}}, \mu)$, there are $\mathcal{X}$-valued independent random variables $\{X_i\}_{i \ge 1}$ defined on a common probability space $(\Omega_{\mathcal{X}}, \mathcal{F}_{\mathcal{X}}, P_{\mathcal{X}})$, with laws $\mathcal{L}(X_i) = P_{\mathcal{X}} \circ X_i^{-1} = \mu$, for all $i \ge 1$. Similarly, for the probability space $(\mathcal{Y}, \mathcal{B}_{\mathcal{Y}}, \nu)$, there are $\mathcal{Y}$-valued independent random variables $\{Y_j\}_{j \ge 1}$ defined on a common probability space $(\Omega_{\mathcal{Y}}, \mathcal{F}_{\mathcal{Y}}, P_{\mathcal{Y}})$, with laws $\mathcal{L}(Y_j) = P_{\mathcal{Y}} \circ Y_j^{-1} = \nu$, for all $j \ge 1$. Define the empirical measures by

$$\mu_n(A)(\omega) := \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_A(X_i(\omega)), \quad \forall A \in \mathcal{B}_{\mathcal{X}},\ \forall \omega \in \Omega_{\mathcal{X}}; \tag{3.1}$$

and

$$\nu_n(B)(\kappa) := \frac{1}{n} \sum_{j=1}^{n} \mathbf{1}_B(Y_j(\kappa)), \quad \forall B \in \mathcal{B}_{\mathcal{Y}},\ \forall \kappa \in \Omega_{\mathcal{Y}}. \tag{3.2}$$

Proposition 3.2.

Let $V : \mathcal{P}(\mathcal{X}) \times \mathcal{P}(\mathcal{Y}) \to \mathbb{R}$ be the value function of Problem (1.1) defined in eq. (1.4), and let $\{\mu_n\}_n$ and $\{\nu_n\}_n$ be the empirical measures defined in eq. (3.1) and (3.2). If $\Phi \in C_b(\mathcal{X} \times \mathcal{Y})$, then there exists a stable matching $\pi^*$. Moreover, almost surely,

$$V(\mu_n,\nu_n) \to V(\mu,\nu) = \int_{\mathcal{X} \times \mathcal{Y}} \Phi(x,y)\, d\pi^*(x,y) = \sup_{\pi \in \Pi_{\mathcal{X},\mathcal{Y}}(\mu,\nu)} \Psi((\mu,\nu),\pi),$$

where $\Psi$ denotes the objective function of Problem (1.1) defined in eq. (1.3).

Proof. First, note that by the Monge-Kantorovich Duality Theorem (e.g., Villani [2003, Theorem 1.3]), the assumption that $\Phi \in C_b(\mathcal{X} \times \mathcal{Y})$ guarantees the existence of a solution $\pi^*$ to Problem (1.1). Hence, by Proposition 3.1, $\pi^*$ is a stable matching. By Theorem 2.3, $\Pi_{\mathcal{X},\mathcal{Y}}$ is continuous, implying that $\mathrm{Gr}(\Pi_{\mathcal{X},\mathcal{Y}})$ is closed. Since $\Phi \in C_b(\mathcal{X} \times \mathcal{Y})$, it follows that the objective function $\Psi$ of Problem (1.1) is continuous in the product topology. Since $\Pi_{\mathcal{X},\mathcal{Y}}$ is continuous and has nonempty, compact values, and since $\mathcal{P}(\mathcal{X} \times \mathcal{Y})$ is Hausdorff, being metrizable, continuity of the value function $V$ of Problem (1.1) follows from Berge's Maximum Theorem (e.g., Aliprantis and Border [2006, Theorem 17.31]). By Varadarajan's extension (Dudley [2002, Theorem 11.4.1]) of the classical Glivenko-Cantelli Theorem, the sequences $\{\mu_n\}_n$ and $\{\nu_n\}_n$ converge almost surely to $\mu$ and $\nu$, respectively, since the spaces $\mathcal{X}$ and $\mathcal{Y}$ are separable. Therefore, $\mu_n \to \mu$ and $\nu_n \to \nu$ almost surely. Hence, by continuity of $V$, it follows that $V(\mu_n,\nu_n) \to V(\mu,\nu)$ almost surely.

Remark 3.3.

In light of the Monge-Kantorovich Duality Theorem, the assumption in Proposition 3.2 that $\Phi \in C_b(\mathcal{X} \times \mathcal{Y})$ can be weakened to an assumption that $\Phi$ is upper-semicontinuous and that there are some lower-semicontinuous functions $a \in L^1(\mathcal{X}, \mathcal{B}_{\mathcal{X}}, \mu)$ and $b \in L^1(\mathcal{Y}, \mathcal{B}_{\mathcal{Y}}, \nu)$ such that

$$\Phi(x,y) \le a(x) + b(y), \quad \text{for } \mu\text{-a.e. } x \text{ and } \nu\text{-a.e. } y.$$

Remark 3.4 (Hedonic Price Equilibria).

Chiappori et al. [2010] show that there exists a canonical correspondence between models of hedonic pricing with quasi-linear preferences and TU matching models, and hence a fortiori surplus maximization problems (in light of Proposition 3.1). This was extended by Pass [2019] to a setting of multi-marginal optimal transport (tripartite matching). We refer to Ekeland [2005], Ekeland [2010], Chiappori et al. [2010], and Pass [2019] for more about models of hedonic equilibria and their equivalence to surplus maximization problems. Proposition 3.2 above can therefore be used to show the existence of a hedonic price equilibrium, when the type distributions of buyers and sellers (the probability measures $\mu$ and $\nu$) are unknown.

References

C.D. Aliprantis and K.C. Border. Infinite Dimensional Analysis: A Hitchhiker's Guide. Springer, Berlin, third edition, 2006.

J. Bergin. On the continuity of correspondences on sets of measures with restricted marginals. Economic Theory, 13:471–481, 1999.

P. Billingsley. Convergence of Probability Measures. John Wiley & Sons, New York, second edition, 1999.

P-A. Chiappori, R.J. McCann, and L.P. Nesheim. Hedonic price equilibria, stable matching, and optimal transport: equivalence, topology, and uniqueness. Economic Theory, 42(2):317–354, 2010.

R.M. Dudley. Real Analysis and Probability. Cambridge University Press, Cambridge, second edition, 2002.

I. Ekeland. An optimal matching problem. ESAIM: Control, Optimisation and Calculus of Variations, 11(1):57–71, 2005.

I. Ekeland. Existence, uniqueness and efficiency of equilibrium in hedonic markets with multidimensional types. Economic Theory, 42(2):275–315, 2010.

S.N. Ethier and T.G. Kurtz. Markov Processes: Characterization and Convergence. John Wiley & Sons, Hoboken, New Jersey, second edition, 2005.

A. Galichon. Optimal Transport Methods in Economics. Princeton University Press, Princeton, 2016.

N.E. Gretsky, J.M. Ostroy, and W.R. Zame. The nonatomic assignment model. Economic Theory, 2(1):103–127, 1992.

B. Pass. Interpolating between matching and hedonic pricing models. Economic Theory, 67(2):393–419, 2019.

A. Savchenko and M. Zarichnyi. Correspondences of probability measures with restricted marginals. Proc. Intern. Geom. Center, 7(4):34–39, 2014.

L.S. Shapley and M. Shubik. The assignment game I: The core. International Journal of Game Theory, 1(1):111–130, 1971.

C. Villani. Topics in Optimal Transportation. American Mathematical Society, Providence, 2003.

C. Villani.