A survey of some recent applications of optimal transport methods to econometrics

Abstract

This paper surveys recent applications of methods from the theory of optimal transport to econometric problems.


ALFRED GALICHON


Keywords : optimal transport, matching, quantile methods, discrete choice, convex analysis.

JEL Classification: C01, C02.

Date: September 2016. Galichon's research has received funding from the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013) / ERC grant agreement 313699.

1. Introduction

Optimal transport, popularized by Villani's texts (Villani, 2003; Villani, 2009), is currently a very active research area of mathematics, and it has found applications in many sciences. Economics is no exception. Until recently, however, optimal transport appeared in economics only in connection with two-sided models of matching (see e.g. Becker, 1973; Shapley and Shubik, 1972; Gretsky et al., 1992): indeed, as shown by Chiappori et al. (2010), equilibrium outcomes in two-sided matching problems with transferable utility coincide with the solutions of an optimal transport problem. More recently, methods from optimal transport theory have been used as a tool in a number of problems in econometrics, microeconomic theory, and finance. These methods are presented in a comprehensive way in my recent monograph, Optimal Transport Methods in Economics (Galichon, 2016), aimed at an audience of economists. The goal of the present paper, which partly follows the presentation there, is to provide a short introduction to the use of optimal transport methods in econometrics. The reader will hopefully forgive me for a bias toward my own research in selecting applications.

The paper is organized as follows. Section 2 provides a brief overview of the theory, and states its main result, the Monge-Kantorovich theorem. Section 3 discusses three particular cases where, because of additional restrictions, the analysis of solutions of the Monge-Kantorovich problem can be pushed further, which is most helpful in applications. Section 4 reviews a number of econometric applications involving optimal transport methods.

2. Monge-Kantorovich theory in a nutshell

2.1. Optimal coupling.

We start by describing the optimal transport problem at an intermediate level of generality, which will suffice for our applications. Let $\mathcal{X}$ and $\mathcal{Y}$ be two closed subsets of $\mathbb{R}^d$ and $\mathbb{R}^{d'}$, respectively, and consider two (Borel) probability distributions $P$ and $Q$ with respective supports $\mathcal{X}$ and $\mathcal{Y}$. A coupling of $P$ and $Q$ is a joint probability distribution $\pi$ on $\mathcal{X} \times \mathcal{Y}$ with marginal distributions $P$ and $Q$: if $(X, Y)$ is a random vector with probability distribution $\pi$, then its components $X$ and $Y$ are random vectors with respective probability distributions $P$ and $Q$. The set of such couplings will be denoted
$$\mathcal{M}(P, Q) = \{\pi : (X, Y) \sim \pi \text{ implies } X \sim P \text{ and } Y \sim Q\}.$$
Hence, $\pi \in \mathcal{M}(P, Q)$ encodes the "missing information" needed in order to build a random vector $(X, Y)$ when $X \sim P$ and $Y \sim Q$. When $\mathcal{X} = \mathcal{Y} = [0, 1]$ and $P = Q = \mathcal{U}([0, 1])$, $\mathcal{M}(P, Q)$ coincides with the set of copulas, which are well-known objects in applied probability whose purpose is also to build bivariate distributions based on univariate ones. Therefore, couplings can be seen as a generalization of copulas beyond the univariate case.

Following Monge (Monge, 1781) and Kantorovich (Kantorovich, 1939, 1948), we shall consider the Monge-Kantorovich problem of finding the "optimal" coupling of probability distributions $P$ and $Q$. By optimal, we mean here that it should maximize the expectation of some surplus function $\Phi : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$, that is,
$$\sup_{\pi \in \mathcal{M}(P, Q)} \mathbb{E}_\pi[\Phi(X, Y)]. \tag{2.1}$$
We can interpret this problem as a problem of worker-firm assignment: a central planner needs to assign on a one-to-one basis a population of workers, whose skills are distributed on set $\mathcal{X}$ with distribution $P$, to a population of firms whose characteristics are distributed according to $Q$ on set $\mathcal{Y}$. The economic value created by worker $x$ if employed by firm $y$ is $\Phi(x, y)$. An assignment of workers to firms is defined as $\pi \in \mathcal{M}(P, Q)$, which measures the distribution of the matches. The total economic value created by an assignment $\pi$ is therefore $\mathbb{E}_\pi[\Phi(X, Y)]$, and hence the maximum economic value that the central planner can hope to achieve is (2.1).

2.2. Duality and the Monge-Kantorovich theorem.

Problem (2.1) is an infinite-dimensional linear programming problem. Indeed, the objective function is linear because it is an expectation with respect to the measure $\pi$, which is the optimization variable, and the constraint $\pi \in \mathcal{M}(P, Q)$ can be expressed as
$$\mathbb{E}_\pi[\varphi(X) + \psi(Y)] = \mathbb{E}_P[\varphi(X)] + \mathbb{E}_Q[\psi(Y)]$$
for all integrable test functions $\varphi : \mathcal{X} \to \mathbb{R}$ and $\psi : \mathcal{Y} \to \mathbb{R}$, and these constraints are linear with respect to $\pi$.

As a result, problem (2.1) has a dual formulation. The dual formulation can be worked out by hand, by noticing that if $\varphi$ and $\psi$ are test functions such that $\Phi(x, y) \le \varphi(x) + \psi(y)$ for all $x$ and $y$ in their domains, then by integration with respect to any $\pi \in \mathcal{M}(P, Q)$ and by using the fact that $\pi$ has margins $P$ and $Q$, it follows that
$$\mathbb{E}_\pi[\Phi(X, Y)] \le \mathbb{E}_P[\varphi(X)] + \mathbb{E}_Q[\psi(Y)].$$
This being true for any $\pi \in \mathcal{M}(P, Q)$ and any test functions $\varphi$ and $\psi$ such that $\Phi(x, y) \le \varphi(x) + \psi(y)$, it follows by taking the supremum on the left-hand side and the infimum on the right-hand side that the value of problem (2.1) is less than or equal to the value of the dual problem
$$\inf_{\varphi(x) + \psi(y) \ge \Phi(x, y)} \mathbb{E}_P[\varphi(X)] + \mathbb{E}_Q[\psi(Y)] \tag{2.2}$$
where the infimum is over all $P$- and $Q$-integrable functions $\varphi$ and $\psi$ such that $\varphi(x) + \psi(y) \ge \Phi(x, y)$. Under a few additional assumptions on $P$, $Q$ and $\Phi$, the Monge-Kantorovich theorem asserts that this inequality actually holds as an equality, and that an optimal coupling $\pi$ exists. Under an additional assumption, pairs of solution potentials $(\varphi, \psi)$ to the dual problem (2.2) also exist. This is made precise in the following result:

Theorem 1 (Monge-Kantorovich duality theorem). Assume $\Phi : \mathcal{X} \times \mathcal{Y} \to \mathbb{R} \cup \{-\infty\}$ is an upper semicontinuous surplus function bounded from above by $a(x) + b(y)$, where $a$ and $b$ are respectively integrable with respect to $P$ and $Q$. Then:

(i) Strong duality holds, that is:
$$\sup_{\pi \in \mathcal{M}(P, Q)} \mathbb{E}_\pi[\Phi(X, Y)] = \inf_{\varphi(x) + \psi(y) \ge \Phi(x, y)} \mathbb{E}_P[\varphi(X)] + \mathbb{E}_Q[\psi(Y)], \tag{2.3}$$
where the infimum on the right-hand side is taken over measurable and integrable functions $\varphi$ and $\psi$, and the inequality constraint should be satisfied for $P$-almost every $x$ and $Q$-almost every $y$.

(ii) An optimal solution $\pi$ to the primal problem on the left-hand side exists.

(iii) Assume further that $\Phi$ is bounded from below by $a(x) + b(y)$, where $a$ and $b$ are (possibly different) functions respectively integrable with respect to $P$ and $Q$. Then the dual problem on the right-hand side also has solutions.

For a proof of this result, refer to Villani, 2009, Theorem 5.10.

Let us now explore a criterion for jointly checking that a coupling $\pi$ and a pair of potential functions $(\varphi, \psi)$ are optimal for the primal and the dual respectively. Take $\pi \in \mathcal{M}(P, Q)$ and take $\varphi$ and $\psi$ such that $\varphi(x) + \psi(y) \ge \Phi(x, y)$. Then $\pi$ and $(\varphi, \psi)$ are respectively solutions to (2.1) and (2.2) if and only if the equality $\mathbb{E}_\pi[\varphi(X) + \psi(Y)] = \mathbb{E}_\pi[\Phi(X, Y)]$ holds. This is equivalent to the fact that
$$\mathrm{Supp}(\pi) \subseteq \{(x, y) \in \mathcal{X} \times \mathcal{Y} : \varphi(x) + \psi(y) = \Phi(x, y)\}. \tag{2.4}$$
In the worker-firm interpretation alluded to above, this condition has an interpretation in terms of pairwise stability. Indeed, it implies that $\varphi(x)$ can be interpreted as the payoff of a worker of type $x$, and $\psi(y)$ can be interpreted as the payoff of a firm of type $y$. If $x$ and $y$ are matched at equilibrium, then $(x, y) \in \mathrm{Supp}(\pi)$, thus $\varphi(x) + \psi(y) = \Phi(x, y)$, that is, the output $\Phi(x, y)$ created by the $(x, y)$ pair is divided into the payoff of the worker $\varphi(x)$ and the payoff of the firm $\psi(y)$. If $x$ and $y$ are not matched, then $\varphi(x) + \psi(y) \ge \Phi(x, y)$, which expresses the fact that $x$ and $y$ do not have an incentive to leave their existing partners to form a blocking pair. Thus $(\varphi, \psi)$ is a pair of "stable" payoffs.

2.3. Some remarks.

Let us note a few immediate properties of this problem.

First, note that if one replaces $\Phi(x, y)$ by $\Phi(x, y) + a(x) + b(y)$, then the set of optimal couplings $\pi$ for problem (2.1) remains unchanged: indeed, the value of $\mathbb{E}_\pi[a(X) + b(Y)]$ does not depend on the choice of $\pi \in \mathcal{M}(P, Q)$. Also note that if $(\varphi, \psi)$ is a solution to problem (2.2), then for any real number $c$, $(\varphi - c, \psi + c)$ is also a solution.

Next, note that if a minimizer $(\varphi, \psi)$ of the dual problem (2.2) exists, then one has necessarily
$$\varphi(x) = \max_{y \in \mathcal{Y}} \{\Phi(x, y) - \psi(y)\} \tag{2.5}$$
$$\psi(y) = \max_{x \in \mathcal{X}} \{\Phi(x, y) - \varphi(x)\}, \tag{2.6}$$
which has another interpretation, this time in terms of Walrasian equilibrium: $\varphi(x)$ can be seen as the equilibrium wage of a worker of type $x$, while $\psi(y)$ can be interpreted as the equilibrium surplus of a firm of type $y$. In particular, formula (2.6) expresses the problem of a firm $y \in \mathcal{Y}$ looking for a worker $x \in \mathcal{X}$ yielding the highest surplus $\Phi(x, y) - \varphi(x)$.

Formulas (2.5) and (2.6) allow one to rewrite problem (2.2) in a number of alternative ways:

• Using the expression of $\varphi$ given in (2.5), problem (2.2) rewrites as
$$\inf_\psi \left\{ \mathbb{E}_P\left[\max_{y \in \mathcal{Y}} \{\Phi(X, y) - \psi(y)\}\right] + \mathbb{E}_Q[\psi(Y)] \right\} \tag{2.7}$$

• Using the fact that a constant can be freely added to or subtracted from $\psi$, one may look at the solutions of problem (2.2) among those such that $\mathbb{E}_Q[\psi(Y)] = 0$, that is, problem (2.2) further rewrites as
$$\inf_\psi \mathbb{E}_P\left[\max_{y \in \mathcal{Y}} \{\Phi(X, y) - \psi(y)\}\right] \quad \text{s.t.} \quad \mathbb{E}_Q[\psi(Y)] = 0.$$

These alternative formulations will turn out to be useful below.

3. Three cases of interest

3.1. Discrete optimal assignment problem.

A first case of interest arises when $P$ and $Q$ are discrete distributions with finite support: $P = \sum_{i=1}^n p_i \delta_{x_i}$ and $Q = \sum_{j=1}^m q_j \delta_{y_j}$. In this case, we denote $\Phi_{ij} = \Phi(x_i, y_j)$, and duality (2.3) rewrites as
$$\max_{\pi \ge 0} \Big\{ \sum_{ij} \pi_{ij} \Phi_{ij} \ \text{ s.t. } \sum_j \pi_{ij} = p_i, \ \sum_i \pi_{ij} = q_j \Big\} = \min_{\varphi, \psi} \Big\{ \sum_i p_i \varphi_i + \sum_j q_j \psi_j \ \text{ s.t. } \varphi_i + \psi_j \ge \Phi_{ij} \Big\}. \tag{3.1}$$
The complementary slackness condition states that if $\pi_{ij}$, which is the Lagrange multiplier associated with the dual constraint $\varphi_i + \psi_j \ge \Phi_{ij}$, is strictly positive, then the corresponding constraint is saturated. Thus $\pi_{ij} > 0$ implies $\varphi_i + \psi_j = \Phi_{ij}$, which recovers the general condition (2.4) viewed above.

Formulated as a linear programming problem as in (3.1), the optimal transport problem can be solved using standard linear programming toolboxes; see Galichon, 2016, section 3.4 for how to perform computations efficiently using the sparse structure of the constraint matrix.

3.2. Continuous-to-discrete case.

A second case of interest arises when $P$ is a continuous distribution, but $Q$ is a discrete distribution with finite support: $Q = \sum_{j=1}^m q_j \delta_{y_j}$. In this case, the reformulation of the Monge-Kantorovich problem using expression (2.7) is particularly useful. Indeed, following Aurenhammer (Aurenhammer, 1987), one may then express problem (2.2) as
$$\min_{\psi \in \mathbb{R}^m} \mathbb{E}_P\left[\max_{j \in \{1, \ldots, m\}} \{\Phi(X, y_j) - \psi_j\}\right] + \sum_{j=1}^m q_j \psi_j, \tag{3.2}$$
which is simply the problem of minimizing a convex function in $\mathbb{R}^m$. See Galichon, 2016, section 5.3 for a discussion on the implementation of this method using gradient descent.
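To make the convex program (3.2) concrete, here is a minimal sketch of plain gradient descent on $\psi$. It is an illustration under stated assumptions, not the implementation discussed in Galichon, 2016: $P$ is approximated by an equally weighted midpoint grid on $[0, 1]$, the surplus is taken to be $\Phi(x, y) = -(x - y)^2$, and the names `cell_shares` and `semi_discrete_ot`, the step size and the iteration count are all hypothetical choices. The partial derivative of the objective with respect to $\psi_j$ is $q_j$ minus the $P$-mass of the points $x$ for which alternative $j$ attains the maximum, which is what the update uses.

```python
def cell_shares(xs, ys, psi, surplus):
    """P-mass of each cell {x : j maximises surplus(x, y_j) - psi_j}."""
    counts = [0] * len(ys)
    for x in xs:
        best = max(range(len(ys)), key=lambda j: surplus(x, ys[j]) - psi[j])
        counts[best] += 1
    return [c / len(xs) for c in counts]


def semi_discrete_ot(xs, ys, q, surplus, step=0.5, iters=200):
    """Gradient descent on psi for the objective in (3.2).

    xs is an equally weighted point cloud standing in for P (an assumption
    of this sketch); the gradient in psi_j is q_j minus the share of cell j.
    """
    psi = [0.0] * len(ys)
    for _ in range(iters):
        shares = cell_shares(xs, ys, psi, surplus)
        psi = [p - step * (qj - sj) for p, qj, sj in zip(psi, q, shares)]
    return psi


def surplus(x, y):
    # Illustrative surplus (assumption): negative squared distance.
    return -(x - y) ** 2


n_grid = 1000
xs = [(k + 0.5) / n_grid for k in range(n_grid)]  # stands in for P = U([0, 1])
ys = [0.0, 1.0]
q = [0.25, 0.75]

psi = semi_discrete_ot(xs, ys, q, surplus)
shares = cell_shares(xs, ys, psi, surplus)
print(shares, psi[1] - psi[0])
```

At the minimizer the cell masses match $q$. In this toy case the boundary between the two cells sits at $x = (1 + \psi_2 - \psi_1)/2$, so matching a share of $0.25$ for the first point requires $\psi_2 - \psi_1 = -0.5$ (in the code, `psi[1] - psi[0]`).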

3.3. Scalar product surplus.

A third case of interest is the case when $P$ and $Q$ are continuous distributions on $\mathcal{X} = \mathcal{Y} = \mathbb{R}^d$ ($d = d'$), and when $\Phi(x, y) = x^\top y$ is the scalar product between $x$ and $y$. In this case, under the assumption that $P$ and $Q$ have finite second moments, a solution $(\varphi, \psi)$ to (2.2) exists, and relations (2.5) and (2.6) become
$$\varphi(x) = \max_{y \in \mathbb{R}^d} \{x^\top y - \psi(y)\}, \qquad \psi(y) = \max_{x \in \mathbb{R}^d} \{x^\top y - \varphi(x)\},$$
hence $\varphi$ and $\psi$ are related by Legendre-Fenchel conjugation, which is classically denoted $\psi = \varphi^*$ and $\varphi = \psi^*$. A short tutorial on convex analysis from the point of view of optimal transport is provided in Galichon, 2016, section 6.1, where the basic notions used in the sequel, such as Legendre transform and subdifferential, are recapitulated.

By (2.4), if $\pi \in \mathcal{M}(P, Q)$ is a coupling and $(\varphi, \psi)$ with $\psi = \varphi^*$ is a dual-feasible pair, then $\pi$ and $(\varphi, \psi)$ are both optimal if and only if the support of $\pi$ is included in the set of $(x, y)$ such that $\varphi(x) + \varphi^*(y) = x^\top y$. But in convex analysis this relation is equivalent to $y \in \partial\varphi(x)$, where $\partial\varphi(x)$ denotes the subdifferential of $\varphi$ at $x$. Hence, $\pi$ and $(\varphi, \psi)$ are both optimal if and only if
$$\mathrm{Supp}(\pi) \subseteq \{(x, y) \in \mathbb{R}^d \times \mathbb{R}^d : y \in \partial\varphi(x)\}.$$
However, it is a well-known fact in convex analysis that if $\varphi$ is convex, then the set of points where $\varphi$ is not differentiable is of zero Lebesgue measure. As a result, if $P$ is absolutely continuous with respect to the Lebesgue measure, then $\partial\varphi(x)$ coincides $P$-almost surely with $\{\nabla\varphi(x)\}$. Therefore, in this case, an optimal coupling $(X, Y) \sim \pi$ can be represented by $(X, \nabla\varphi(X))$, where $\varphi$ is convex and is such that $\nabla\varphi(X) \sim Q$. This is denoted as
$$\nabla\varphi \# P = Q \tag{3.3}$$
and one says that $\nabla\varphi$ "pushes forward $P$ onto $Q$." The existence and uniqueness (up to a constant) of a convex function $\varphi$ satisfying (3.3) are the object of Brenier's theorem (Brenier, 1987; see also Knott and Smith, 1984 and Rüschendorf and Rachev, 1990).
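In dimension one the push-forward structure can be checked by hand. The sketch below is an illustration that is not part of the original text: it assumes $P = \mathcal{U}([0, 1])$ and $Q$ exponential with unit rate, for which the nondecreasing map $T(x) = -\log(1 - x)$ (the quantile function of $Q$ composed with the c.d.f. of $P$) plays the role of $\nabla\varphi$. The code approximates $P$ by a midpoint grid, checks that $T$ is monotone and pushes $P$ forward to $Q$, and compares the surplus $\mathbb{E}[X\,T(X)]$ of the monotone coupling with that of the reversed coupling $(X, T(1 - X))$, which has the same margins. The name `brenier_map` is a hypothetical label.

```python
import math

N = 10_000
xs = [(k + 0.5) / N for k in range(N)]  # midpoint grid standing in for P = U([0, 1])


def brenier_map(x):
    """Monotone map T = F_Q^{-1} o F_P for P = U([0, 1]) and Q = Exp(1)."""
    return -math.log(1.0 - x)


ts = [brenier_map(x) for x in xs]

# T is nondecreasing, i.e. the derivative of a convex function in dimension one.
is_monotone = all(a <= b for a, b in zip(ts, ts[1:]))

# Push-forward check: the c.d.f. of T(X) at t = 1 should be close to 1 - exp(-1).
pushed_cdf_at_1 = sum(t <= 1.0 for t in ts) / N
target_cdf_at_1 = 1.0 - math.exp(-1.0)

# Surplus E[X * Y] under the monotone coupling and under the reversed coupling.
surplus_monotone = sum(x * brenier_map(x) for x in xs) / N
surplus_reversed = sum(x * brenier_map(1.0 - x) for x in xs) / N

print(is_monotone, pushed_cdf_at_1, target_cdf_at_1)
print(surplus_monotone, surplus_reversed)
```

The exact values of the two integrals are $\int_0^1 -x \log(1 - x)\,dx = 3/4$ and $\int_0^1 -x \log x\,dx = 1/4$, so the monotone coupling yields the larger surplus, as the theory predicts.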
Brenier's theorem was improved by McCann (McCann, 1995), who obtained the existence of $\varphi$ in (3.3) without requiring $P$ and $Q$ to have second moments. (The almost-everywhere differentiability of convex functions invoked above follows from Rademacher's theorem; see Villani, 2009, Theorem 10.8.)

4. Econometric applications

4.1. Demand inversion in discrete choice models.

Our first application is the problem of demand inversion in discrete choice, or random utility, models. Consider a discrete choice model, where agents face alternatives $y \in \mathcal{Y}$. The systematic utility associated with alternative $y$ is a real number $\delta_y$, and the unobserved heterogeneity associated with it is $\varepsilon_y$. It is assumed that $\varepsilon \sim P$, where $P$ is a probability distribution on $\mathbb{R}^{\mathcal{Y}}$.

Let $q$ be a probability vector on $\mathcal{Y}$. One says that $q$ is a vector of market shares induced by the systematic utility vector $\delta$ if $q$ is the probability distribution of a random variable $Y$ such that $Y \in \arg\max_{y \in \mathcal{Y}} \{\delta_y + \varepsilon_y\}$. The problem of demand inversion in discrete choice models, as defined for instance in Berry, 1994, is as follows: given a vector of market shares $q$, what is the set of systematic utility vectors $D(q)$ whose elements $\delta$ induce the choice probability $q$? Formally,
$$D(q) = \left\{ \delta \in \mathbb{R}^{\mathcal{Y}} : \exists\, Y \sim q, \ Y \in \arg\max_{y \in \mathcal{Y}} \{\delta_y + \varepsilon_y\} \right\}.$$
The main observation here, due to Galichon and Salanié, 2014, is that $\delta \in D(q)$ if and only if $\psi = -\delta$ appears in the solution of the dual Monge-Kantorovich problem (2.2). Indeed, define
$$G(\delta) = \mathbb{E}_P\left[\max_{y \in \mathcal{Y}} \{\delta_y + \varepsilon_y\}\right];$$
then the envelope theorem shows that $\delta \in D(q)$ if and only if $q \in \partial G(\delta)$. But according to a basic result in convex analysis (e.g. Galichon, 2016, section 6.1), $q \in \partial G(\delta)$ if and only if $\delta \in \partial G^*(q)$; thus $D(q) = \partial G^*(q)$, and
$$D(q) = \arg\min_\delta \left\{ G(\delta) - \sum_{y \in \mathcal{Y}} q_y \delta_y \right\}.$$
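The arg min characterization of $D(q)$ can be checked numerically in a case where the answer is known in closed form. The sketch below assumes, purely for illustration, that the $\varepsilon_y$ are i.i.d. standard Gumbel (the logit model), a specification the text above does not impose. In that case $G(\delta) = \log \sum_y e^{\delta_y}$ up to an additive constant, the gradient of $G$ is the vector of logit choice probabilities, and $D(q) = \{\log q + c\}$ for an arbitrary constant $c$. The function names and the step size are hypothetical choices.

```python
import math


def logit_shares(delta):
    """Gradient of G(delta) = log(sum(exp(delta))): the logit choice probabilities."""
    m = max(delta)
    weights = [math.exp(d - m) for d in delta]
    total = sum(weights)
    return [w / total for w in weights]


def invert_demand(q, step=1.0, iters=500):
    """Minimise G(delta) - sum_y q_y * delta_y by gradient descent.

    Assumes i.i.d. standard Gumbel heterogeneity (logit); the gradient of
    the objective is logit_shares(delta) - q.
    """
    delta = [0.0] * len(q)
    for _ in range(iters):
        shares = logit_shares(delta)
        delta = [d - step * (s - qy) for d, s, qy in zip(delta, shares, q)]
    return delta


q = [0.5, 0.3, 0.2]
delta = invert_demand(q)

# Utilities are identified up to a constant: normalise on the first alternative.
normalised = [d - delta[0] for d in delta]
closed_form = [math.log(qy / q[0]) for qy in q]
print(normalised)
print(closed_form)
```

The two printed vectors should agree up to numerical precision, which illustrates that inverting demand amounts to minimizing the convex function $G(\delta) - \sum_y q_y \delta_y$.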

Thus, $\delta \in D(q)$ if and only if $\psi = -\delta$ minimizes $G(-\psi) + \sum_{y \in \mathcal{Y}} q_y \psi_y$, that is, if and only if it solves
$$\min_\psi \mathbb{E}_P\left[\max_{y \in \mathcal{Y}} \{\varepsilon_y - \psi_y\}\right] + \sum_{y \in \mathcal{Y}} q_y \psi_y,$$
which is exactly problem (3.2) with $\Phi(\varepsilon, y) = \varepsilon_y$. Therefore the problem of demand inversion in discrete choice models is equivalent to an optimal transport problem, and numerical methods for solving the latter can be used for the former. See the papers Galichon and Salanié, 2014, Chiong et al., 2015, and an overview in Galichon, 2016, section 9.2.

4.2. Multivariate quantiles.

Quantiles play a fundamental role in econometrics and applied statistics. They are useful for comparing distributional outcomes, measuring risk and inequality, identifying willingness-to-pay, etc. One of the most stringent limitations of quantiles is the fact that they are fundamentally univariate objects: indeed, the quantile function of a real-valued random variable $Y$ is defined as the inverse map of its cumulative distribution function, which can only be inverted when $Y$ is univariate.

There is a considerable literature aiming at providing various generalizations of the notion of quantile to the multivariate case, which we will not review here. Our point here is to show that optimal transport provides a sensible such generalization; see Galichon, 2016, section 6.3.

One possible way to define the (univariate) quantile map is as the monotone map which pushes forward the uniform distribution on $[0, 1]$ to a distribution of interest; in other words, the quantile map associated with a random variable $Y \sim \nu$ is the map $T$ such that (i) $T$ is nondecreasing and (ii) if $U \sim \mathcal{U}([0, 1])$, then $T(U)$ has the same distribution $\nu$ as $Y$. This point of view will be the basis of our multivariate generalization of the notion of quantiles to the case of a random vector of dimension $d$: letting $\mu$ be a probability distribution of reference over $\mathbb{R}^d$ (when $d = 1$, a natural choice is the uniform distribution), the $\mu$-quantile map associated with $\nu$ is the map $T : \mathbb{R}^d \to \mathbb{R}^d$ such that:

• $T \# \mu = \nu$, which means that $T$ pushes forward the distribution $\mu$ to $\nu$; and

• $T$ is "monotone" in the sense that it is the gradient of a convex function: $T = \nabla\varphi$, where $\varphi : \mathbb{R}^d \to \mathbb{R}$.

The shape restriction that $T$ should be the gradient of a convex function is a natural generalization as, in dimension one, monotone maps are the gradients/derivatives of convex functions. That a solution to this requirement exists and is unique is the object of McCann's theorem referred to above.

Let us now discuss how the $\mu$-quantile associated with an empirical distribution is constructed. Let $\{y_1, \ldots, y_n\}$ be a sample, and let $\nu_n = n^{-1} \sum_{i=1}^n \delta_{y_i}$ be the associated empirical distribution. Because the quantile map is constructed as the gradient of a convex function, it is easy to see that when the target distribution has finite support, the quantile map should be the gradient of a piecewise affine and convex function: indeed, the gradient of such a function takes a finite number of possible values. In this case, the quantile map $T_n$ associated with $\nu_n$ will be defined as
$$T_n(u) = \arg\max_{y_i \in \{y_1, \ldots, y_n\}} \left\{ u^\top y_i - \psi_i \right\}$$
where the weights $\psi_i$ form a solution to problem (3.2) with the surplus $\Phi$ chosen as the scalar product. Hence, the empirical quantile map can be obtained by solving a finite-dimensional optimization problem.
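As a sanity check on this construction, the following sketch computes $T_n$ in dimension one, where it should reduce to the usual empirical quantile function. It is an illustration under stated assumptions rather than the procedure of the cited papers: the reference distribution $\mu = \mathcal{U}([0, 1])$ is approximated by a midpoint grid, the weights $\psi_i$ are obtained by plain gradient descent on the objective of (3.2) with surplus $u\,y_i$ and $q_i = 1/n$, and the function name, step size and iteration count are hypothetical.

```python
def empirical_quantile_map(sample, n_grid=600, step=0.2, iters=300):
    """Return T_n(u) = argmax over sample points y_i of {u * y_i - psi_i} (d = 1).

    The weights psi solve (3.2) with surplus u * y and q_i = 1/n, where
    mu = U([0, 1]) is replaced by a midpoint grid (assumption of this sketch).
    """
    n = len(sample)
    us = [(k + 0.5) / n_grid for k in range(n_grid)]
    psi = [0.0] * n

    def assign(u):
        # Index of the sample point attaining the maximum for reference point u.
        return max(range(n), key=lambda i: u * sample[i] - psi[i])

    for _ in range(iters):
        counts = [0] * n
        for u in us:
            counts[assign(u)] += 1
        # Gradient of the objective in psi_i: q_i minus the mu-mass of cell i.
        psi = [p - step * (1.0 / n - c / n_grid) for p, c in zip(psi, counts)]

    return lambda u: sample[assign(u)]


sample = [3.0, 1.0, 2.0]
T_n = empirical_quantile_map(sample)
print(T_n(0.1), T_n(0.5), T_n(0.9))
```

With three sample points each cell of the reference distribution has mass close to $1/3$, so $T_n$ sends the lower, middle and upper thirds of $[0, 1]$ to the smallest, median and largest observation respectively, whatever the order in which the sample is listed.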
Consistency of $T_n$, which expresses that the $\mu$-quantile of $\nu_n$ converges to the $\mu$-quantile of $\nu$, has been shown in Chernozhukov et al., Forthcoming.

This definition of $\mu$-quantiles, which was originally introduced in Ekeland et al., 2011, has a number of applications, including:

• A notion of multivariate comonotonicity, which allows one to construct multivariate measures of financial exposures. This was the original motivation in Ekeland et al., 2011; see also Bosc and Galichon, 2014. The definition is that $Y_1$ and $Y_2$ are $\mu$-comonotone if the $\mu$-quantile of $Y_1 + Y_2$ is the sum of the $\mu$-quantiles of $Y_1$ and $Y_2$.

• Multivariate counterparts for a number of stochastic orders, which are known to rely on the notion of quantiles in the univariate case, such as first-order stochastic dominance. See Charpentier et al., 2016.

• An extension of the theory of rank-dependent expected utility to multivariate risky prospects. Rank-dependent utility functions were built as a response to paradoxes in expected utility theory such as Allais' paradox. A multivariate extension of Yaari's utility function was proposed in Galichon and Henry, 2012.

• A characterization of Pareto efficient risk-sharing arrangements in the multivariate case, as was done in Carlier et al., 2012, extending a result by Landsberger and Meilijson, 1994 characterizing the efficient risk-sharing arrangements as the comonotone ones.

• An extension of Matzkin's quantile-based identification results in hedonic models (see e.g. Matzkin, 2003 and Heckman et al., 2010) to the case with more than one attribute, as is done in Chernozhukov et al., 2015.

• A multivariate version of quantile regression based on a semiparametric extension of the Monge-Kantorovich case, proposed in Carlier et al., 2016.

4.3. Partial identification.

Optimal transport can also be a useful tool to handle partial identification in incomplete models. Consider an economic model with parameter $\theta \in \Theta$ which predicts that the population's income $X$ will have distribution $P_\theta$. Assume that the income $X$ is not perfectly observed, but that only the tax bracket to which the income belongs is observed. If $y$ is the mid-point of the bracket, let $\Gamma(y) = [l(y), u(y)]$ be the corresponding bracket. The brackets are indexed by their mid-point $Y$, whose distribution $Y \sim Q$ is assumed to be observed. The identified set $\Theta_I$ is therefore the set of parameters $\theta \in \Theta$ such that the distribution $X \sim P_\theta$ predicted by the model is compatible with the observed distribution of the brackets $Y \sim Q$. (We abstract away from any sample uncertainty here.) More precisely, $\Theta_I$ is the set of $\theta$ such that there is a joint probability $\pi \in \mathcal{M}(P_\theta, Q)$ such that $\pi(X \in \Gamma(Y)) = 1$.

This compatibility problem can be formulated as an optimal transport problem. Indeed, if $\Phi(x, y) = 1\{x \in \Gamma(y)\}$, then $\theta \in \Theta_I$ if and only if
$$\max_{\pi \in \mathcal{M}(P_\theta, Q)} \mathbb{E}_\pi[1\{X \in \Gamma(Y)\}] = 1. \tag{4.1}$$
Problem (4.1) is an optimal transport problem, and thus the numerical determination of $\Theta_I$ boils down to the computation of the value of such a problem. This equivalence has been put to use in Galichon and Henry, 2011 and Ekeland et al., 2010; see a synthetic presentation in Galichon, 2016, section 9.1.

4.4. Revealed preference.

There is an interesting connection between optimal transport and Afriat's theorem on revealed preference inequalities (Afriat, 1967). Indeed, the problem of revealed preference in its most basic form consists in the following: given the observation of $n$ bundles $x_1, \ldots, x_n$ in $\mathbb{R}^d$, and given corresponding price vectors $p_1, \ldots, p_n$ in $\mathbb{R}^d$, one would like to recover nontrivial utility functions $u$ such that
$$x_i = \arg\max_{x_j :\ p_i^\top x_j \le p_i^\top x_i} u(x_j).$$
If such a utility function exists, one shall say that the observations $\{(x_i, p_i)\}$ are rationalizable. As shown by Afriat, rationalizability is equivalent to the existence of $\lambda \in \mathbb{R}^n_+$, $\lambda \ne 0$, and $u \in \mathbb{R}^n$ such that
$$u_i - u_j \ge \lambda_i p_i^\top (x_i - x_j). \tag{4.2}$$
As was pointed out in Ekeland and Galichon, 2013 (see also Kolesnikova et al., 2013), this problem can be reformulated using an optimal transport problem. Let $p$ be the vector of uniform probability over $\{1, \ldots, n\}$, i.e. $p_i = 1/n$ for $i = 1, \ldots, n$. Let $\Delta$ be the set of $\lambda \in \mathbb{R}^n_+$ such that $\sum_{i=1}^n \lambda_i = 1$. For $\lambda \in \Delta$, set
$$W(\lambda) = \max_{\pi \in \mathcal{M}(p, p)} \sum_{1 \le i, j \le n} \pi_{ij} \Phi^\lambda_{ij} \tag{4.3}$$
where $\Phi^\lambda_{ij} = \lambda_i p_i^\top (x_i - x_j)$. By Monge-Kantorovich duality, we have
$$W(\lambda) = \frac{1}{n} \min \left\{ \sum_{i=1}^n u_i + \sum_{j=1}^n v_j : u_i + v_j \ge \Phi^\lambda_{ij} \right\}.$$
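For small $n$, the value $W(\lambda)$ in (4.3) can be computed by brute force, which gives a feel for the criterion before the argument continues. The sketch below is illustrative only and relies on a standard fact not stated in the text: since both margins are uniform on $n$ points, the linear program attains its maximum at a permutation matrix divided by $n$ (Birkhoff-von Neumann), so it suffices to enumerate permutations. The data and the function names are hypothetical.

```python
from itertools import permutations


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def afriat_surplus(prices, bundles, lam):
    """Matrix Phi_ij = lam_i * p_i . (x_i - x_j) appearing in (4.3)."""
    n = len(bundles)
    return [
        [lam[i] * dot(prices[i], [a - b for a, b in zip(bundles[i], bundles[j])])
         for j in range(n)]
        for i in range(n)
    ]


def W(prices, bundles, lam):
    """Value of (4.3), by enumeration of permutations (only sensible for small n)."""
    n = len(bundles)
    phi = afriat_surplus(prices, bundles, lam)
    best = max(sum(phi[i][s[i]] for i in range(n)) for s in permutations(range(n)))
    return best / n


prices = [(1.0, 2.0), (2.0, 1.0)]
lam = [0.5, 0.5]

# Each bundle is chosen while the other one is unaffordable: no violation.
w_consistent = W(prices, [(4.0, 1.0), (1.0, 4.0)], lam)

# Each bundle is chosen while the other is strictly cheaper: a revealed
# preference cycle, so the data cannot be rationalized.
w_cycle = W(prices, [(1.0, 4.0), (4.0, 1.0)], lam)

print(w_consistent, w_cycle)
```

In the first data set the identity assignment is optimal and $W(\lambda) = 0$; in the second, swapping the two observations yields a strictly positive value, and with $n = 2$ this value does not depend on the choice of $\lambda \in \Delta$.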

Note that one has $W(\lambda) \ge 0$, as can be seen by taking $\pi^*_{ij} = 1\{i = j\}/n$ in expression (4.3). Assume that $W(\lambda) = 0$. Then it means that $\pi^*$ is optimal. Take a pair $(u, v)$ which is optimal for the dual problem. By complementary slackness, $u_i + v_i = \Phi^\lambda_{ii} = 0$, hence $v_i = -u_i$. The dual feasibility condition $u_i + v_j \ge \Phi^\lambda_{ij}$ then implies Afriat's inequalities (4.2). Conversely, it is quite easy to show that if Afriat's inequalities are satisfied, then $W(\lambda) = 0$. This provides a particularly simple criterion: observations $\{(x_i, p_i)\}$ are rationalizable if and only if
$$\min_{\lambda \in \Delta} W(\lambda) = 0,$$
which is a convex minimization problem. Further, the subdifferential of $W$ at $\lambda$ is the set of $Z(\pi) \in \mathbb{R}^n$ such that $Z_i(\pi) = p_i^\top \left( x_i/n - \sum_{j=1}^n \pi_{ij} x_j \right)$, for any $\pi$ solution of (4.3).

5. Conclusion

This short article has hopefully convinced the reader of the growing importance of optimal transport methods as part of the standard econometrician's toolbox. Because it is so intrinsically connected with notions such as linear programming, convex analysis, duality, quantiles, copulas, clustering, graphs, and numerical methods, investing some time in the study of this theory can only be fruitful. While this article has kept a focus on econometrics, a number of other economic applications of optimal transport exist, notably in mechanism design, labor economics, family economics, and asset pricing. Some of these applications are reviewed in Optimal Transport Methods in Economics (Galichon, 2016).

References

Afriat, S. (1967). The construction of a utility function from demand data. International Economic Review, 67–77.

Aurenhammer, F. (1987). Power diagrams: Properties, algorithms and applications. SIAM Journal on Computing, 1647–1661.

Becker, G. (1973). A theory of marriage, part I. Journal of Political Economy, 813–846.

Berry, S. (1994). Estimating discrete-choice models of product differentiation. RAND Journal of Economics, 242–262.

Bosc, D., & Galichon, A. (2014). Extreme dependence for multivariate data. Quantitative Finance, 1187–1199.

Brenier, Y. (1987). Décomposition polaire et réarrangement monotone des champs de vecteurs. C.R. Acad. Sci. Paris Série I, 805–808.

Carlier, G., Chernozhukov, V., & Galichon, A. (2016). Vector quantile regression. Annals of Statistics, 1165–1192.

Carlier, G., Dana, R.-A., & Galichon, A. (2012). Pareto efficiency for the concave order and multivariate comonotonicity. Journal of Economic Theory, 207–229.

Charpentier, A., Galichon, A., & Henry, M. (2016). Local utility and risk aversion. Mathematics of Operations Research, 466–476.

Chernozhukov, V., Galichon, A., Hallin, M., & Henry, M. (Forthcoming). Monge-Kantorovich depth, quantiles, ranks and signs. Annals of Statistics.

Chernozhukov, V., Galichon, A., Henry, M., & Pass, B. (2015). Single market nonparametric identification of multi-attribute hedonic equilibrium models [working paper].

Chiappori, P.-A., McCann, R., & Nesheim, L. (2010). Hedonic price equilibria, stable matching, and optimal transport: Equivalence, topology, and uniqueness. Economic Theory, 317–354.

Chiong, K., Galichon, A., & Shum, M. (2015). Duality in dynamic discrete choice models. Quantitative Economics, 83–115.

Ekeland, I., & Galichon, A. (2013). The housing problem and revealed preference theory: Duality and an application. Economic Theory, 425–441.

Ekeland, I., Galichon, A., & Henry, M. (2010). Optimal transportation and the falsifiability of incompletely specified economic models. Economic Theory, 355–374.

Ekeland, I., Galichon, A., & Henry, M. (2011). Comonotonic measures of multivariate risks. Mathematical Finance, 109–132.

Galichon, A. (2016). Optimal transport methods in economics. Princeton University Press.

Galichon, A., & Henry, M. (2011). Set identification in models with multiple equilibria. Review of Economic Studies, 1264–1298.

Galichon, A., & Henry, M. (2012). Dual theory of choice with multivariate risks. Journal of Economic Theory, 1501–1516.

Galichon, A., & Salanié, B. (2014). Cupid's invisible hand: Social surplus and identification in matching models [working paper].

Gretsky, N., Ostroy, J., & Zame, W. (1992). The nonatomic assignment model. Economic Theory, 103–127.

Heckman, J. J., Matzkin, R., & Nesheim, L. (2010). Nonparametric identification and estimation of nonadditive hedonic models. Econometrica, 1569–1591.

Kantorovich, L. (1939). Mathematical methods in the organization and planning of production. Management Science, 366–422.

Kantorovich, L. (1948). On a problem of Monge. Uspekhi Mat. Nauk, 225–226.

Knott, M., & Smith, C. (1984). On the optimal mapping of distributions. Journal of Optimization Theory and its Applications, 39–49.

Kolesnikova, A., Kudryavtsevab, O., & Nagapetyanc, T. (2013). Remarks on Afriat's theorem and the Monge–Kantorovich problem. Journal of Mathematical Economics, 501–505.

Landsberger, M., & Meilijson, I. (1994). Comonotone allocations, Bickel-Lehmann dispersion and the Arrow-Pratt measure of risk aversion. Annals of Operation Research, 97–106.

Matzkin, R. (2003). Nonparametric estimation of nonadditive random functions. Econometrica, 1339–1375.

McCann, R. (1995). Existence and uniqueness of monotone measure-preserving maps. Duke Mathematical Journal, 309–323.

Monge, G. (1781). Mémoire sur la théorie des déblais et des remblais. Histoire de l'Académie royale des sciences de Paris (pp. 666–704).

Rüschendorf, L., & Rachev, S. (1990). A characterization of random variables with minimum $L^2$-distance. Journal of Multivariate Analysis, 48–54.

Shapley, L., & Shubik, M. (1972). The assignment game I: The core. International Journal of Game Theory, 111–130.

Villani, C. (2003). Topics in optimal transportation. American Mathematical Society.

Villani, C. (2009). Optimal transport, old and new. Springer-Verlag.