A large-scale regularity theory for the Monge-Ampère equation with rough data and application to the optimal matching problem

Abstract

The aim of this paper is to obtain quantitative bounds for solutions to the optimal matching problem in dimension two. These bounds show that, up to a logarithmically divergent shift, the optimal transport maps are close to the identity at every scale. They allow us to pass to the limit as the system size goes to infinity and to construct a locally optimal coupling between the Lebesgue measure and the Poisson point process which retains the stationarity properties of the Poisson point process only at the level of second-order differences. Our quantitative bounds are obtained through a Campanato iteration scheme based on a deterministic and a stochastic ingredient. The deterministic part, which can be seen as our main contribution, is a regularity result for Monge-Ampère equations with rough right-hand side. Since we believe that it could be useful in other contexts, we prove it for general space dimensions. The stochastic part is a concentration result for the optimal matching problem which builds on previous work by Ambrosio, Stra and Trevisan.



Michael Goldman ∗ Martin Huesmann † Felix Otto ‡ August 29, 2018


∗ Université Paris-Diderot, Sorbonne Paris-Cité, Sorbonne Université, CNRS, Laboratoire Jacques-Louis Lions, LJLL, F-75013 Paris, France
† Martin Huesmann, Institut für Mathematik, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115 Bonn, Germany
‡ Max Planck Institute for Mathematics in the Sciences, 04103 Leipzig, Germany

We are interested in the optimal matching problem between the Lebesgue measure and the Poisson point process $\mu$ on the torus $Q_L := \left[-\frac{L}{2}, \frac{L}{2}\right)^d$ (i.e. $Q_L$ with the periodized induced metric $|\cdot|_{\mathrm{per}}$ from $\mathbb{R}^d$):
\[
W_{2,\mathrm{per}}^2\left(\frac{\mu(Q_L)}{|Q_L|}\chi_{Q_L}, \mu\right), \tag{1.1}
\]
where $W_{2,\mathrm{per}}^2$ denotes the squared $L^2$-Wasserstein distance on $Q_L$ with respect to $|\cdot|_{\mathrm{per}}$. This problem and some of its variants, such as generalizations to $L^p$-costs or more general reference measures, have been the subject of intensive work in the past thirty years (see for instance [30, 10, 6, 26, 27]). As far as we know, essentially all the previous papers investigating (1.1) focused on estimating the mean of (1.1), e.g. [1, 19, 10, 29, 6], or the deviation from the mean, e.g. [18, 21], by constructing on the one hand sophisticated couplings whose costs are asymptotically optimal and proving on the other hand ansatz-free lower bounds. The only exception is [25] where, for $d \ge 3$, stationary couplings between the Lebesgue measure and a Poisson point process on $\mathbb{R}^d$ minimizing the cost per unit volume are constructed. From these works, in fact since [1], it is understood that $d = 2$ is the critical dimension for (1.1). Indeed, while for $d \ge 3$, $\mathbb{E}_L\left[\frac{1}{L^d} W_{2,\mathrm{per}}^2\left(\frac{\mu(Q_L)}{|Q_L|}\chi_{Q_L}, \mu\right)\right]$ is of order 1, it is logarithmically diverging for $d = 2$ (see Section 1.1.1).

We focus here on the critical dimension $d = 2$ and aim at a better description of the optimal transport maps, that is, of minimizers of (1.1). Building on a large-scale regularity theory for convex maps solving the Monge-Ampère equation
\[
\nabla\psi \,\#\, dx = \mu, \tag{1.2}
\]
which we develop along the way, we prove that from the macroscopic scale $L$ down to the microscopic scale the solution of (1.1) is close to the identity plus a shift. Here closeness is measured with respect to a scale-invariant $L^2$ norm. Our main result is the following:

Theorem 1.1.

Assume that $d = 2$. There exists $c > 0$ such that for each dyadic $L \ge 1$ there exists a random variable $r_{*,L} = r_{*,L}(\mu) \ge 1$ satisfying the exponential bound
\[
\sup_L \mathbb{E}_L\left[\exp\left(\frac{c\, r_{*,L}^2}{\log 2 r_{*,L}}\right)\right] < \infty
\]
and such that if$^{\mathrm{i}}$ $r_{*,L} \ll L$ there exists $x_L = x_L(\mu) \in Q_L$ with$^{\mathrm{ii}}$
\[
|x_L|^2 \lesssim r_{*,L}^2 \log^3\left(\frac{L}{r_{*,L}}\right) \tag{1.3}
\]
such that if $T = T_{\mu,L}$ is the minimizer of (1.1), then for every $2 r_{*,L} \le \ell \le L$, there holds
\[
\frac{1}{\ell^4}\int_{B_\ell(x_L)} |T - (x - x_L)|^2 \lesssim \frac{\log^3\left(\frac{\ell}{r_{*,L}}\right)}{\left(\frac{\ell}{r_{*,L}}\right)^2}. \tag{1.4}
\]

Note that the bounds in (1.3) and (1.4) are probably not optimal. Indeed, in both estimates one would rather expect a linear dependence on the logarithms. Similarly, using a similar proof in dimension $d \ge 3$ would lead to a bound of order $\log(L)$ for the shift $x_L$ even though it is expected to be of order one. In order to improve our bounds, one would need to better capture some cancellation effects. Notice however that the proof of (1.4) leads to the optimal estimate (cf. Remark 4.1)
\[
\inf_{\xi \in \mathbb{R}^2} \frac{1}{\ell^4}\int_{B_\ell(x_L)} |T - (x - \xi)|^2 \lesssim \frac{\log\left(\frac{\ell}{r_{*,L}}\right)}{\left(\frac{\ell}{r_{*,L}}\right)^2} \qquad \forall\, 2 r_{*,L} \le \ell \le L.
\]

Let us point out that Theorem 1.1 would also hold for the Euclidean transport problem on $Q_L$. The motivation for considering instead the transport problem on the torus comes from the good stationarity properties of the optimal transport maps in this setting. Indeed, since the bound (1.4) is uniform in $L$ one can try to construct a covariant and locally optimal coupling between the Lebesgue measure and the Poisson point process on $\mathbb{R}^2$ by taking the limit $L \to \infty$ in $T_{\mu,L}$. A similar strategy has been implemented in [25] for $d \ge 3$$^{\mathrm{iii}}$. One of the key ingredients used in that paper, namely the fact that the minimal cost per unit volume is finite, is missing for $d = 2$.
The presence of the logarithmically divergent shift $x_L$ in (1.4) can be seen as a manifestation of the logarithmic divergence of the minimal cost per unit volume. In order to pass to the limit we thus need to renormalize the transport map by subtracting this shift. Because of this renormalization the limiting map will lose its stationarity properties, which will roughly speaking only survive at the level of the gradient. Also, since we do not have any uniqueness statement for the limit objects, we will have to pass to the limit in the sense of Young measures.

i We use the short-hand notation $A \ll 1$ to indicate that there exists $\varepsilon > 0$ such that $A \le \varepsilon$. Similarly, $A \lesssim B$ means that there exists a dimensional constant $C > 0$ such that $A \le CB$.
ii Here and in the rest of the paper log denotes the natural logarithm.
iii Notice that actually in [25] a slightly relaxed version of (1.1) was considered and it is not known (although conjectured) that solutions of (1.1) converge to the unique stationary coupling with minimal cost per unit volume.

In order to state our second main result, we need more notation. It is easier to pass to the limit at the level of the Kantorovich potentials rather than for the corresponding transport maps. Also, since the Lebesgue measure on $\mathbb{R}^2$ is invariant under arbitrary shifts while the Poisson point process on $Q_L$ (extended by periodicity to $\mathbb{R}^2$) is not, it is more natural to make the shift in the domain and keep the image unchanged. This could also serve as motivation for centering the estimate (1.4) around $x_L$ (which is approximately equal to $T^{-1}(0)$) rather than around $0$.

To be more precise, denote by $\widehat\psi = \widehat\psi_{\mu,L}$ the Kantorovich potential defined on $\mathbb{R}^2$ and associated to $T_{\mu,L}$, i.e. $T_{\mu,L} = \nabla\widehat\psi_{\mu,L}$, satisfying $\widehat\psi_{\mu,L}(0) = 0$ (see Section 2.3). Define
\[
\psi_{\mu,L}(x) := \widehat\psi_{\mu,L}(x + x_L)
\]
with corresponding Legendre dual $\psi^*_{\mu,L}(y) = \widehat\psi^*_{\mu,L}(y) - x_L \cdot y$. With this renormalization we still have
\[
\nabla\psi_{\mu,L} \,\#\, \frac{\mu(Q_L)}{|Q_L|} = \mu.
\]
Denote the space of all real-valued convex functions $\psi : \mathbb{R}^2 \to \mathbb{R}$ by $\mathcal{K}$ and the space of all locally finite point configurations by $\Gamma$. We equip $\mathcal{K}$ with the topology of locally uniform convergence and $\Gamma$ with the topology obtained by testing against continuous and compactly supported functions. There is a natural action on $\Gamma$ by $\mathbb{R}^2$ denoted by $\theta_z$ and given by $\theta_z\mu := \mu(\cdot + z)$ for $z \in \mathbb{R}^2$.
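The formula for the Legendre dual of the shifted potential stated above can be checked from the definition of the Legendre transform. The following computation is only a sketch, assuming the usual convention $\psi^*(y) := \sup_x [x \cdot y - \psi(x)]$; it also indicates why second-order differences in $y$ do not see the shift $x_L$.

```latex
\begin{align*}
\psi^*_{\mu,L}(y)
  &= \sup_{x} \big[ x \cdot y - \widehat\psi_{\mu,L}(x + x_L) \big] \\
  &= \sup_{z} \big[ (z - x_L) \cdot y - \widehat\psi_{\mu,L}(z) \big]
     \qquad (z := x + x_L) \\
  &= \widehat\psi^*_{\mu,L}(y) - x_L \cdot y .
\end{align*}
% The correction -x_L . y is affine in y, hence it cancels in any
% second-order difference: for every h,
\[
\psi^*_{\mu,L}(y+h) + \psi^*_{\mu,L}(y-h) - 2\psi^*_{\mu,L}(y)
 = \widehat\psi^*_{\mu,L}(y+h) + \widehat\psi^*_{\mu,L}(y-h) - 2\widehat\psi^*_{\mu,L}(y).
\]
```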
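Define the map $\Psi_L : \Gamma \to \mathcal{K}$ by $\mu \mapsto \psi_{\mu,L}$ and denote by $\mathbb{P}_L$ the Poisson point process on $Q_L$. For each dyadic $L \ge 1$, define the Young measure associated to $\Psi_L$ by
\[
q_L := (\mathrm{id}, \Psi_L)_\# \mathbb{P}_L = \mathbb{P}_L \otimes \delta_{\Psi_L}.
\]
Then, we have the following result: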

Theorem 1.2.

The sequence $(q_L)_L$ of probability measures is tight. Moreover, any accumulation point $q$ satisfies the following properties:
(i) The first marginal of $q$ is the Poisson point process $\mathbb{P}_L$;
(ii) $q$ almost surely $\nabla\psi \,\#\, dx = \mu$;
(iii) for any $h, z \in \mathbb{R}^2$ and $f \in C_b(\Gamma \times C(\mathbb{R}^2))$ there holds
\[
\int_{\Gamma \times \mathcal{K}} f(\mu, D_h\psi^*)\, dq = \int_{\Gamma \times \mathcal{K}} f(\theta_{-z}\mu, D_h\psi^*(\cdot - z))\, dq,
\]
where $D_h\psi^*(y) := \psi^*(y + h) + \psi^*(y - h) - 2\psi^*(y)$.

Part (iii) of Theorem 1.2 says that for any $h$, under the measure $q$ the random variable $(\mu, \psi) \mapsto (\mu, D_h\psi^*)$ is stationary with respect to the action induced by the natural shifts on $\Gamma$ and $\mathcal{K}$. Observe that, as already pointed out, while the second-order increments of the potentials $\psi$ are stationary, the induced couplings $(\mathrm{id}, \nabla\psi)_\# dx$ are not. This is due to the necessary renormalization by $x_L$. It is an interesting open problem to understand whether one can prove that the sequence $(\Psi_L)_L$ actually converges and get rid of the Young measures. A slightly weaker open problem is to show non/uniqueness of the accumulation points of $(q_L)_L$.

The proof of Theorem 1.1 is inspired by (quantitative) stochastic homogenization, in the sense that it is based on a Campanato iteration scheme which allows one to transfer the information that (1.4) holds at the "thermodynamic" scale (here, the scale $L \uparrow \infty$ of the torus) by [1, 6] to scales of order one (here, the scale $r_{*,L}$). This is reminiscent of the approach of Avellaneda and Lin [9] to a regularity theory for (linear) elliptic equations with periodic coefficients: The good regularity theory of the homogenized operator, i.e. the regularity theory on the thermodynamic scale, is passed down to the scale of the periodicity.
This approach has been adapted by Armstrong and Smart [8] to the case of random coefficients; the approach has been further refined by Gloria, Neukamm and the last author [22] (see also [7]) where the random analogue of the scale of periodicity, and an analogue to $r_{*,L}$ in this paper, has been introduced and optimally estimated (incidentally also by concentration-of-measure arguments as in this paper).

The Campanato scheme is obtained by a combination of a deterministic and a stochastic argument. The deterministic one is similar in spirit to [23]. It asserts that if at some scale $R > 0$ the excess energy is small and the Wasserstein distance of $\mu \llcorner B_R$ to $\frac{\mu(B_R)}{|B_R|}\chi_{B_R}$ is also small then up to an affine change of variables the excess energy is well controlled by these two quantities at scale $\theta R$ for some $\theta \ll 1$. The aim of the stochastic part is to prove that with overwhelming probability, the Euclidean $L^2$-Wasserstein distance $\frac{1}{R^4} W_2^2\left(\mu \llcorner B_R, \frac{\mu(B_R)}{|B_R|}\right)$ is small for every (dyadic) scale $R$ between $L$ and $1$ so that the Campanato scheme can indeed be iterated down to the microscopic scale. Since our proof of the stochastic estimate is based on the results of [6] which are stated for cubes, we will actually prove the stochastic estimate on cubes instead of balls. We now describe these two parts separately in some more detail.

We start with the stochastic aspect since it is simpler. For a measure $\mu$ on $Q_L$ and $\ell \le L$ we denote its restriction to $Q_\ell \subset Q_L$ by $\mu_\ell := \mu \llcorner Q_\ell$. Then, the main stochastic ingredient for the proof of Theorem 1.1 is the following result:

Theorem 1.3.

For dyadic $L \ge 1$ and $\mu$ a Poisson point process on $Q_L$ there exists a universal constant $c > 0$ and a family of random variables $r_{*,L} \ge 1$ satisfying
\[
\sup_L \mathbb{E}_L\left[\exp\left(\frac{c\, r_{*,L}^2}{\log(2 r_{*,L})}\right)\right] < \infty
\]
and such that for every dyadic $\ell$ with $2 r_{*,L} \le \ell \le L$,
\[
\frac{1}{\ell^4} W_2^2\left(\mu_\ell, \frac{\mu(Q_\ell)}{\ell^2}\right) \le \frac{\log\left(\frac{\ell}{r_{*,L}}\right)}{\left(\frac{\ell}{r_{*,L}}\right)^2}.
\]

The proof of this result relies on an adaptation of the concentration argument put forward in [6, Remark 4.7]. One of the differences between this article and [6] is that in [6] the more classical version of (1.1), namely the matching of the empirical measure of $n$ iid uniformly distributed points $X_1, \ldots, X_n$ on the cube $Q = \left[-\frac{1}{2}, \frac{1}{2}\right)^d$ to their reference measure, is considered:
\[
C_{n,d} := \mathbb{E}\left[W_2^2\left(\chi_Q, \frac{1}{n}\sum_{i=1}^n \delta_{X_i}\right)\right] = \frac{1}{n^{2/d}}\, \mathbb{E}\left[W_2^2\left(\frac{1}{n}\chi_{Q_{n^{1/d}}}, \frac{1}{n}\sum_{i=1}^n \delta_{n^{1/d} X_i}\right)\right]. \tag{1.5}
\]
Since the typical distance between nearby points $X_i$ and $X_j$ is of order $n^{-1/d}$ it is expected that $C_{n,d} \sim n^{-2/d}$. However, it turns out that this is only true in $d \ge 3$. Since the seminal work [1], it is known that in dimension two an extra logarithmic factor appears. In dimension one the correct scaling is of order $\frac{1}{n}$ so that we can summarize
\[
C_{n,d} \sim \begin{cases} \frac{1}{n}, & d = 1, \text{ cf. [11]} \\ \frac{\log n}{n}, & d = 2, \text{ cf. [1]} \\ \frac{1}{n^{2/d}}, & d \ge 3, \text{ cf. [10, …]}. \end{cases}
\]
Based on a linearization ansatz of the Monge-Ampère equation suggested by [13] in the physics literature, [6] significantly strengthened the two-dimensional case to
\[
\lim_{n \to \infty} \frac{n}{\log n}\, C_{n,2} = \frac{1}{4\pi}.
\]
Additionally, it is remarked in [6, Remark 4.7] that the mass concentrates around the mean. Combining this concentration argument with conditioning on the number of points of $\mu$ in $Q_L$ and a Borel-Cantelli argument we show Theorem 1.3.
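The one-dimensional scaling $C_{n,1} \sim \frac{1}{n}$ recalled above can be observed numerically, because in one dimension the optimal map is monotone and the cost has a closed form. The following is only an illustrative sketch and not part of the paper's argument: the function names are hypothetical, the unit interval $[0,1)$ is used instead of $[-\frac{1}{2},\frac{1}{2})$ (the cost is translation invariant), and the expectation is replaced by a Monte Carlo average.

```python
import random


def w2_squared_uniform_1d(points):
    """Squared W_2 distance between the uniform measure on [0, 1) and the
    empirical measure (1/n) * sum of Dirac masses at `points`.

    In one dimension the optimal map is monotone: the i-th smallest point
    receives the mass of the interval [i/n, (i+1)/n) (0-based index i).
    """
    xs = sorted(points)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs):
        a, b = i / n, (i + 1) / n
        # exact value of the integral of (x - t)^2 over t in [a, b]
        total += ((x - a) ** 3 - (x - b) ** 3) / 3.0
    return total


def mean_cost(n, trials, seed=0):
    """Monte Carlo average of the cost over `trials` samples of n iid uniform points."""
    rng = random.Random(seed)
    costs = [
        w2_squared_uniform_1d([rng.random() for _ in range(n)])
        for _ in range(trials)
    ]
    return sum(costs) / trials


if __name__ == "__main__":
    for n in (50, 100, 200):
        # n * (average cost) should stay of order one as n grows
        print(n, n * mean_cost(n, trials=400))
```

If the rescaled values `n * mean_cost(n, ...)` stay bounded away from zero and infinity as `n` grows, this is consistent with the $\frac{1}{n}$ scaling; the logarithmic correction in $d = 2$ is not visible in this one-dimensional experiment.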
As already alluded to, the deterministic ingredient is one step of a Campanato scheme for solutions of the Monge-Ampère equation with arbitrary right-hand side (1.2). Since we believe that this far-reaching generalization of [23, Proposition 4.7] (see also [24]) could have a large range of applications we prove it for arbitrary dimension $d \ge 2$. Let us point out that while [23] gives an alternative proof of the partial regularity result for the Monge-Ampère equation with "regular" data previously obtained in [20, 17] (see [16] for a nice informal presentation of this approach), it is unclear if that other approach based on maximum principles could also be used in our context. Given for some

R > ⊃ B R and an arbitrary measure µ , denoteby T the optimal transport map between χ Ω and µ and let O ⊃ B R be an open set.We have the following result: heorem 1.4. For every < τ (cid:28) , there exist positive constants ε ( τ ) , C ( τ ) , and < θ < such that if R d +2 (cid:90) B R | T − x | + 1 R d +2 W (cid:18) µ O, µ ( O ) | O | (cid:19) ≤ ε ( τ ) , (1.6) then there exists a symmetric matrix B and a vector b ∈ R d such that | B − Id | + 1 R | b | (cid:46) R d +2 (cid:90) B R | T − x | , and letting ˆ x := B − x , ˆΩ := B − Ω and then ˆ T (ˆ x ) := B ( T ( x ) − b ) and ˆ µ := ˆ T χ ˆΩ d ˆ x, so that ˆ T is the optimal transport map between χ ˆΩ and ˆ µ , we have θR ) d +2 (cid:90) B θR | ˆ T − ˆ x | ≤ τR d +2 (cid:90) B R | T − x | + C ( τ ) R d +2 W (cid:18) µ O, µ ( O ) | O | (cid:19) . (1.7)The only reason for not taking O = B R is that in our application, the control on thedata term R d +2 W (cid:16) µ O, µ ( O ) | O | (cid:17) will be given by Theorem 1.3 which is stated forcubes. Let us stress that since (cid:82) B R | T − x | behaves like a squared H norm in termsof the potentials and since the squared Wasserstein distance behaves like a squared H − norm (cf. [31, Theorem 7.26]), all quantities occur in the estimate (1.7) as ifwe were dealing with a second order linear elliptic equation and looking at squared L -based quantities.Since the estimates in Theorem 1.4 are scale-invariant, it is enough to prove it for R = 1. We then let for notational simplicity E := (cid:90) B | T − x | and D := W (cid:18) µ O, µ ( O ) | O | (cid:19) . The main ingredient for the proof of Theorem 1.4 is the following result, which is thecounterpart of [23, Proposition 4.6] and which states that if E + D (cid:28)

1, that is, if theenergy is small and if the data is close to a constant in the natural W − topology, then T − x is quantitatively close to the gradient ∇ ϕ of a solution to a Poisson equation.This quantiﬁes the well-known fact that the Monge-Amp`ere equation linearizes tothe Poisson equation around the constant density. Proposition 1.5.

For every < τ (cid:28) , there exist positive constants ε ( τ ) and C ( τ ) such that if E + D ≤ ε ( τ ) , then there exists a function ϕ with harmonic gradient in B and such that (cid:90) B | T − ( x + ∇ ϕ ) | (cid:46) τ E + C ( τ ) D and (cid:90) B |∇ ϕ | (cid:46) E. s in [23], Proposition 1.5 is actually proven at the Eulerian (or Benamou-Brenier)level. That is, if we let for t ∈ [0 , T t := (1 − t ) Id + tT , ρ := T t χ Ω , and j := T t T − x ) χ Ω , the couple ( ρ, j ) solvesmin ( ρ,j ) (cid:26)(cid:90) Ω (cid:90) ρ | j | : ∂ t ρ + ∇ · j = 0 , ρ = χ Ω , ρ = µ (cid:27) (1.8)and we show Proposition 1.6.

For every < τ (cid:28) , there exist positive constants ε ( τ ) and C ( τ ) such that if E + D ≤ ε ( τ ) , then there exists a function ϕ with harmonic gradient in B such that (cid:90) B (cid:90) ρ | j − ρ ∇ ϕ | (cid:46) τ E + C ( τ ) D (1.9) and (cid:90) B |∇ ϕ | (cid:46) E. As in [23], this is proven by ﬁrst choosing a good radius where the ﬂux of j is wellcontrolled in order to deﬁne ϕ , then obtaining the almost-orthogonality estimate (cid:90) B (cid:90) ρ | j − ρ ∇ ϕ | (cid:46) (cid:18)(cid:90) B (cid:90) ρ | j | − (cid:90) B |∇ ϕ | (cid:19) + τ E + C ( τ ) D (1.10)and ﬁnally constructing a competitor and using the minimality of ( ρ, j ) for (1.8) inorder to estimate the term inside the brackets in (1.10). However, each of these stepsis considerably harder than in [23]. This becomes quite clear considering that byanalogy with [23], letting for R > , f := (cid:82) j · ν where ν is the outward normal to B R , one would like to deﬁne ϕ as a solution of (cid:40) ∆ ϕ = 1 − µ in B R∂ϕ∂ν = f on ∂B R for some well chosen R ∈ ( , ). However if µ is a singular measure ∇ ϕ will typicallynot be in L since this would require L bounds (actually H − would be enough)on f which cannot be obtained from the energy through a Fubini type argumentsince the L ∞ norm of ρ t typically blows up as t →

1. Similar issues were tackledin [6, 26] by molliﬁcation of µ with smooth kernels (the heat and the Mehler kernelrespectively). Here instead we introduce a small time-like parameter τ and workseparately in (0 , − τ ) and in the terminal layer (1 − τ, , − τ ), we take care of the ﬂux going through ∂B R in that time interval. Weﬁrst modify the deﬁnition of f and let f := (cid:82) − τ j · ν , and then change ϕ so thatit connects in B R the constant density equal to 1 to the constant density equal to1 − | B R | (cid:82) ∂B R f , i.e. ϕ solves∆ ϕ = 1 | B R | (cid:90) ∂B R f in B R . (1.11) egarding the Neumann boundary conditions, we face here the problem that eventhough ρ ∈ L ∞ ( B × (0 , − τ )), its L ∞ bound blows up as τ →

0. This leads to L bounds on f which are not uniform in τ . To overcome this diﬃculty, we needto replace f by a better behaved density on ∂B R . This is the role of Lemma 3.4.Treating separately the incoming and outgoing ﬂuxes f ± , we construct densities ρ ± on ∂B R with (cid:90) ∂B R ρ ± (cid:46) E and W ( ρ ± , f ± ) (cid:46) E d +3 d +2 . (1.12)The densities ρ ± can be seen as rearrangements of f ± through projections on ∂B R .Considering the time-dependent version of the Lagrangian problem, since E (cid:28) ∂B R in (0 , − τ ) must come from a small neighborhood of ∂B R attime 0. A key point in deriving (1.12) is that at time 0 the density is well-behaved(since it is constant) and thus the number of particles coming from such a smallneighborhood of ∂B R is under control. The estimate in (1.12) on the Wassersteindistance between ρ ± and f ± is important in view of the construction of a competitor.Indeed, this indicates that if we know how to construct a competitor having ρ ± asboundary ﬂuxes, we will then be able to modify it into a competitor with the correct f ± boundary conditions (see in particular Lemma 2.6). We then complement (1.11)with the boundary conditions ∂ϕ∂ν = ρ + − ρ − on ∂B R . The almost-orthogonality estimate (1.10) is proven in Proposition 3.6. It is readilyseen that assuming for simplicity that R = 1 is the good radius, (cid:90) B (cid:90) ρ | j − ρ ∇ ϕ | (cid:46) (cid:18)(cid:90) B (cid:90) ρ | j | − (cid:90) B |∇ ϕ | (cid:19) + (cid:90) B (cid:18)(cid:90) − τ ρ − (cid:19) |∇ ϕ | + (cid:90) B ϕρ − τ + (cid:90) ∂B ϕ (cid:2) f − ( ρ + − ρ − ) (cid:3) + τ E. (1.13)While in [23] the ﬁrst term in the second line was easily estimated since in that case,up to a small error we had ρ ≤

1, we need here a more delicate argument. In orderto estimate the second term, we use that, again up to choosing a good radius, wemay assume that W (cid:18) ρ − τ B , ρ − τ ( B ) | B | χ B (cid:19) (cid:46) τ E + D, (1.14)see Lemma 3.5. Ignoring issues coming from the ﬂux through ∂B , and thus assumingthat ρ − τ ( B ) | B | = µ ( B ) | B | = µ ( O ) | O | (where O ⊃ B is the open set in the deﬁnition of D ), (1.14) follows from W ( ρ − τ B , µ B ) (cid:46) τ E by displacement interpolation, W ( µ B , µ ( B ) | B | ) (cid:46) D by deﬁnition and triangle inequality. The last term in (1.13)is estimated thanks to the W estimate given in (1.12).Let us ﬁnally describe the construction of the competitor given in Proposition 3.7. Asexplained in the beginning of this discussion, we employ a diﬀerent strategy for the ime intervals (0 , − τ ) and (1 − τ, , − τ ), forgetting the issue of connecting ρ ± to f ± , we mostly take as competitor( (cid:101) ρ, (cid:101) j ) = (cid:18) − t − τ | B | (cid:90) ∂B f , ∇ ϕ (cid:19) + ( s, q ) , where ( s, q ) are supported in the annulus B \ B − r × (0 , − τ ) for some boundarylayer size r (cid:28) s = s − τ = 0 and q · ν = j · ν − f on ∂B × (0 , − τ ). The existence of such a couple ( s, q ) satisfying the appropriateenergy estimates is given by [23, Lemma 3.4] which in turn is inspired by a similarconstruction from [2]. In the terminal time layer (1 − τ, − | B | (cid:82) ∂B f to the measure µ . The outgoing ﬂux is easily treated by pre-placing on ∂B the particles which should leave the domain in (1 − τ, ρ, j ) as competitor. Finally, the remaining part ofthe measure µ (cid:48) ≤ µ which was not coming from particles entering ∂B in (1 − τ, − | B | (cid:82) ∂B f thanks to the estimate W (cid:18) µ (cid:48) , µ (cid:48) ( B ) | B | (cid:19) (cid:46) τ E + D, which is obtained as (1.14) in Lemma 3.5. The plan of the paper is the following. 
In Section 2 we first set up some notation and then prove a few more or less standard elliptic estimates. We then write in Section 2.3 a quick reminder on optimal transportation. In Section 2.4, we give the definition and first properties of the Poisson point process and then prove our main concentration estimates (see Theorem 2.8 and Theorem 2.10). Section 3 is the central part of the paper and contains the proof of Theorem 1.4. We first explain in Section 3.1 how to choose a good radius before proving Proposition 1.6. This is a consequence of Proposition 3.6 and Proposition 3.7 which contain the proof of the almost-orthogonality property (1.10) and the construction of a competitor respectively. In Section 4, we combine the stochastic and deterministic ingredients to perform the Campanato iteration, prove the quantitative bounds of Theorem 1.1, and carry out the construction of the locally optimal coupling between Lebesgue and Poisson given in Theorem 1.2.

Acknowledgements

MH gratefully acknowledges partial support by the DFG through the CRC 1060 "The Mathematics of Emerging Effects" and by the Hausdorff Center for Mathematics. MG and MH thank the Max Planck Institute MIS for its warm hospitality.

Preliminaries

In this paper we will use the following notation. The symbols $\sim$, $\gtrsim$, $\lesssim$ indicate estimates that hold up to a global constant $C$, which typically only depends on the dimension $d$. For instance, $f \lesssim g$ means that there exists such a constant with $f \le Cg$, and $f \sim g$ means $f \lesssim g$ and $g \lesssim f$. An assumption of the form $f \ll 1$ means that there exists $\varepsilon > 0$, typically only depending on dimension, such that if $f \le \varepsilon$, then the conclusion holds.

We write log for the natural logarithm. We denote by $\mathcal{H}^k$ the $k$-dimensional Hausdorff measure. For a set $E$, $\nu_E$ will always denote the external normal to $E$. When clear from the context we will drop the explicit dependence on the set. We write $|E|$ for the Lebesgue measure of a set $E$ and $\chi_E$ for the indicator function of $E$. When no confusion is possible, we will drop the integration measures in the integrals. Similarly, we will often identify, if possible, measures with their densities with respect to the Lebesgue measure. For $R > 0$ and $x \in \mathbb{R}^d$, $B_R(x)$ denotes the ball of radius $R$ centered in $x$. When $x = 0$, we will simply write $B_R$ for $B_R(0)$. We denote the gradient (resp. the Laplace-Beltrami operator) on $\partial B_R$ by $\nabla_{\mathrm{bdr}}$ (resp. $\Delta_{\mathrm{bdr}}$). For $L > 0$, we denote by $Q_L := \left[-\frac{L}{2}, \frac{L}{2}\right)^d$ the cube of side length $L$.

We start by collecting a few more or less standard elliptic estimates which we will need later on.

Lemma 2.1.

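For $f \in L^2(\partial B_1)$ let $\varphi$ be the (unique) solution of
\[
\begin{cases} \Delta\varphi = \frac{1}{|B_1|}\int_{\partial B_1} f & \text{in } B_1 \\ \frac{\partial\varphi}{\partial\nu} = f & \text{on } \partial B_1, \end{cases}
\]
with $\int_{\partial B_1}\varphi = 0$. Then, letting $p := \frac{2d}{d-1}$,
\[
\left(\int_{B_1} |\nabla\varphi|^p\right)^{\frac{1}{p}} \lesssim \left(\int_{\partial B_1} f^2\right)^{\frac{1}{2}}. \tag{2.1}
\]
Moreover, for every $0 < r \le 1$,
\[
\int_{B_1 \setminus B_{1-r}} |\nabla\varphi|^2 \lesssim r \int_{\partial B_1} f^2. \tag{2.2}
\]

Proof.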

Replacing ϕ by ϕ − | x | d | B | (cid:82) ∂B f , we may assume that (cid:82) ∂B f = 0.By Pohozaev’s identity (see [23]) and Poincar´e’s inequality, we have that ϕ ∈ H ( ∂B )with (cid:90) ∂B |∇ ϕ | (cid:46) (cid:90) ∂B f . (2.3) stimate (2.2) follows from (2.3) together with the sub-harmonicity of |∇ ϕ | in theform (cid:90) ∂B r |∇ ϕ | ≤ (cid:90) ∂B |∇ ϕ | for r ≤ . We are just left to prove (2.1). Since by (2.3) we have (cid:82) ∂B |∇ ϕ | (cid:46) (cid:82) ∂B f , andsince ∇ ϕ is harmonic, it suﬃces to show that for every harmonic function ϕ with (cid:82) ∂B ϕ = 0 and thus (cid:82) B ϕ = 0, (cid:18)(cid:90) B | ϕ | p (cid:19) p (cid:46) (cid:18)(cid:90) ∂B ϕ (cid:19) . (2.4)The argument for (2.4) roughly goes as follows: By L -based regularity theory, the H ( B )-norm of ϕ is estimated by the L ( ∂B )-norm of its Dirichlet data, so that(2.4) reduces to a fractional Sobolev inequality on B . If one wants to avoid fractionalSobolev norms, in view of their various deﬁnitions on bounded domains, one needs toconstruct an extension ¯ ϕ of ϕ to the (semi-inﬁnite) cylinder B × (0 , ∞ ), preserving (cid:82) B ¯ ϕ = 0 and with the H ( B × (0 , ∞ ))-norm of ¯ ϕ estimated by the H ( B )-normof ϕ , which combines to (cid:18)(cid:90) B (cid:90) ∞ |∇ ¯ ϕ | + ( ∂ t ¯ ϕ ) (cid:19) (cid:46) (cid:18)(cid:90) ∂B ϕ (cid:19) . (2.5)It then remains to appeal to the (Sobolev-type) trace estimate (cid:18)(cid:90) B | ϕ | p (cid:19) p (cid:46) (cid:18)(cid:90) B (cid:90) ∞ |∇ ¯ ϕ | + ( ∂ t ¯ ϕ ) (cid:19) . (2.6)For the sake of completeness, we now give the arguments for (2.5) and (2.6). Startingwith (2.5), we ﬁrst argue that it suﬃces to consider the case where ϕ | ∂B is aneigenfunction of the Laplace-Beltrami operator − ∆ bdr , say for eigenvalue λ ≥ ϕ ( x ) = r α ϕ (ˆ x ) with r := | x | , ˆ x := xr , α ( d − α ) = λ, (2.7)which follows from ∆ = r d − ∂ r r d − ∂ r + r ∆ bdr , and deﬁne the extension¯ ϕ ( x, t ) = exp( − αt ) ϕ ( x ) . 
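The relation between $\alpha$ and $\lambda$ in (2.7) can be verified by a direct computation. The following is a short sketch, assuming that $\varphi|_{\partial B_1}$ satisfies $-\Delta_{\mathrm{bdr}}\varphi = \lambda\varphi$ and using the polar decomposition of the Laplacian quoted above, $\Delta = r^{1-d}\partial_r r^{d-1}\partial_r + r^{-2}\Delta_{\mathrm{bdr}}$.

```latex
\begin{align*}
\Delta\big(r^{\alpha}\varphi(\hat x)\big)
  &= r^{1-d}\,\partial_r\big(r^{d-1}\,\partial_r r^{\alpha}\big)\,\varphi(\hat x)
     + r^{\alpha-2}\,\Delta_{\mathrm{bdr}}\varphi(\hat x) \\
  &= \alpha(d-2+\alpha)\,r^{\alpha-2}\,\varphi(\hat x)
     - \lambda\,r^{\alpha-2}\,\varphi(\hat x) \\
  &= \big(\alpha(d-2+\alpha) - \lambda\big)\,r^{\alpha-2}\,\varphi(\hat x).
\end{align*}
% Hence r^alpha * phi(x/|x|) is harmonic exactly when alpha(d-2+alpha) = lambda,
% and the nonnegative root is
\[
\alpha = \frac{-(d-2) + \sqrt{(d-2)^2 + 4\lambda}}{2} \ge 0 .
\]
```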
With ¯ ϕ (cid:48) being another function of this form we have with ∇ bdr ϕ denoting the tan-gential part of the gradient of ϕ ( ∇ ¯ ϕ · ∇ ¯ ϕ (cid:48) + ∂ t ¯ ϕ∂ t ¯ ϕ (cid:48) )( x, t )= exp( − ( α + α (cid:48) ) t ) r α + α (cid:48) − (cid:0) ∇ bdr ϕ · ∇ bdr ϕ (cid:48) + αα (cid:48) (1 + r ) ϕϕ (cid:48) (cid:1) (ˆ x ) , nd thus by integration by parts on ∂B we have for every r, t (cid:90) ∂B r ×{ t } ∇ ¯ ϕ · ∇ ¯ ϕ (cid:48) + ∂ t ¯ ϕ∂ t ¯ ϕ (cid:48) = exp( − ( α + α (cid:48) ) t ) r α + α (cid:48) − ( λ + αα (cid:48) (1 + r )) (cid:90) ∂B ϕϕ (cid:48) . From this we learn that the L ( ∂B )-orthogonality of the eigenspaces transmits tothe extensions; hence we may indeed restrict to an eigenfunction. Integrating in ( r, t )the last identity for ϕ (cid:48) = ϕ we obtain (cid:90) B (cid:90) ∞ |∇ ¯ ϕ | + ( ∂ t ¯ ϕ ) = 4 α + λ + λ α α − (cid:90) ∂B ϕ . Since (2.7) implies that λ (cid:46) α + 1, we get (2.5).We now turn to (2.6), which we deduce from the (mean-value zero) Sobolev estimate (cid:18)(cid:90) B (cid:90) ∞ | ¯ ϕ | q (cid:19) q (cid:46) (cid:18)(cid:90) B (cid:90) ∞ |∇ ¯ ϕ | + ( ∂ t ¯ ϕ ) (cid:19) where q = 2( d + 1) d − , (2.8)which holds because the analogue estimate holds on every B × ( n − , n ), n ∈ N ,since (cid:82) B × ( n − ,n ) ¯ ϕ = 0. From | ddt (cid:82) B | ¯ ϕ | p | ≤ p (cid:82) B | ¯ ϕ | p − | ∂ t ¯ ϕ | and Cauchy-Schwarz’sinequality (using 2( p −

1) = q ) we have (cid:90) ∞ (cid:12)(cid:12)(cid:12)(cid:12) ddt (cid:90) B | ¯ ϕ | p (cid:12)(cid:12)(cid:12)(cid:12) ≤ p (cid:18)(cid:90) B (cid:90) ∞ | ¯ ϕ | q (cid:19) (cid:18)(cid:90) B (cid:90) ∞ ( ∂ t ¯ ϕ ) (cid:19) . Therefore, using that (cid:82) B | ¯ ϕ | p → t → ∞ , we get (cid:18)(cid:90) B | ϕ | p (cid:19) p (cid:46) (cid:18)(cid:90) B (cid:90) ∞ | ¯ ϕ | q (cid:19) p (cid:18)(cid:90) B (cid:90) ∞ ( ∂ t ¯ ϕ ) (cid:19) p (2.8) (cid:46) (cid:18)(cid:90) B (cid:90) ∞ |∇ ¯ ϕ | + ( ∂ t ¯ ϕ ) (cid:19) q p + p = (cid:18)(cid:90) B (cid:90) ∞ |∇ ¯ ϕ | + ( ∂ t ¯ ϕ ) (cid:19) , that is (2.6).For the choice of a good radius (see Lemma 3.5 below) we will need the following nottotally standard elliptic estimate. Lemma 2.2.

For every c > and every ( z, f ) , with ≤ z ≤ c and Spt z ⊂ B \ B ,the unique solution ϕ (up to additive constants) of  ∆ ϕ = z − | B | (cid:16)(cid:82) B z − (cid:82) ∂B f (cid:17) in B ∂ϕ∂ν = f on ∂B satisﬁes (cid:90) B |∇ ϕ | (cid:46) (cid:90) ∂B f + c (cid:90) B (1 − | x | ) z. roof. Without loss of generality, we may assume that (cid:82) B ϕ = 0 and that (cid:82) ∂B f + (cid:82) B (1 − | x | ) z < ∞ , otherwise there is nothing to prove. Moreover, by scaling wemay assume that c = 1. Using integration by parts, the trace inequality for Sobolevfunctions together with the Poincar´e inequality for functions of mean zero, (cid:90) B |∇ ϕ | = (cid:90) ∂B ϕf − (cid:90) B ϕ ∆ ϕ ≤ (cid:18)(cid:90) ∂B ϕ (cid:19) (cid:18)(cid:90) ∂B f (cid:19) + (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) B zϕ (cid:12)(cid:12)(cid:12)(cid:12) (cid:46) (cid:18)(cid:90) B |∇ ϕ | (cid:19) (cid:18)(cid:90) ∂B f (cid:19) + (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) B zϕ (cid:12)(cid:12)(cid:12)(cid:12) . Using Young’s inequality, it is now enough to prove that (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) B zϕ (cid:12)(cid:12)(cid:12)(cid:12) (cid:46) (cid:18)(cid:90) B |∇ ϕ | (cid:19) (cid:18)(cid:90) B (1 − | x | ) z (cid:19) . (2.9)For ω ∈ ∂B , letting z ( ω ) := (cid:90) z ( rω ) r d − dr, we claim that (cid:90) ∂B z (cid:46) (cid:90) B (1 − | x | ) z. 
(2.10)Indeed, momentarily ﬁxing ω ∈ ∂B and setting ψ ( r ) := r d − z ( rω ) for r ∈ [0 , ≤ ψ ≤ (cid:90) ψ = z ( ω ) , so that for almost every ω ∈ ∂B , (cid:90) (1 − r ) r d − z ( rω ) ≥ min ≤ (cid:101) ψ ≤ (cid:82) (cid:101) ψ = z ( ω ) (cid:90) (1 − r ) (cid:101) ψ ( r ) (cid:38) z ( ω ) , where the last inequality follows since the minimizer ofmin ≤ (cid:101) ψ ≤ (cid:82) (cid:101) ψ = z ( ω ) (cid:90) (1 − r ) (cid:101) ψ ( r )is given by the characteristic function of (1 − z ( ω ) , pt z ⊂ B \ B , we can thus write (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) B zϕ (cid:12)(cid:12)(cid:12)(cid:12) ≤ (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) ∂B ϕz (cid:12)(cid:12)(cid:12)(cid:12) + (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:90) ∂B (cid:90) ( ϕ ( rω ) − ϕ ( ω )) z ( rω ) r d − drdω (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:46) (cid:18)(cid:90) ∂B ϕ (cid:19) (cid:18)(cid:90) ∂B z (cid:19) + (cid:90) ∂B (cid:90) | ϕ ( rω ) − ϕ ( ω ) | z ( rω ) r d − drdω (2.10) (cid:46) (cid:18)(cid:90) B |∇ ϕ | (cid:19) (cid:18)(cid:90) B (1 − | x | ) z (cid:19) + (cid:90) ∂B (cid:90) | ϕ ( rω ) − ϕ ( ω ) | z ( rω ) r d − drdω, where in the last line we used once more that (cid:82) ∂B ϕ (cid:46) (cid:82) B |∇ ϕ | . Since for r ∈ ( , | ϕ ( rω ) − ϕ ( ω ) | ≤ (cid:32)(cid:90) | ∂ r ϕ ( sω ) | ds (cid:33) (cid:46) (cid:18)(cid:90) | ∂ r ϕ ( sω ) | s d − ds (cid:19) , estimate (2.9) follows from (cid:90) ∂B (cid:90) | ϕ ( rω ) − ϕ ( ω ) | z ( rω ) r d − drdω (cid:46) (cid:90) ∂B (cid:90) (cid:18)(cid:90) | ∂ r ϕ ( sω ) | s d − ds (cid:19) z ( rω ) r d − drdω = (cid:90) ∂B (cid:18)(cid:90) | ∂ r ϕ ( sω ) | s d − ds (cid:19) z ( ω ) dω ≤ (cid:18)(cid:90) ∂B (cid:90) | ∂ r ϕ ( sω ) | s d − ds (cid:19) (cid:18)(cid:90) ∂B z (cid:19) (2.10) (cid:46) (cid:18)(cid:90) B |∇ ϕ | (cid:19) (cid:18)(cid:90) B (1 − | x | ) z (cid:19) . In order to set up notation, let us quickly recall some well known facts about optimaltransportation. 
Much more can be found for instance in the books [31, 32, 5, 28], to name just a few. We will always work here with transportation between (multiples of) characteristic functions and arbitrary measures, so that we restrict our presentation to this setting.

For a measure $\Pi$ on $\mathbb{R}^d\times\mathbb{R}^d$ we denote its marginals by $\Pi_1$ and $\Pi_2$, i.e. $\Pi_1(A) = \Pi(A\times\mathbb{R}^d)$, $\Pi_2(A) = \Pi(\mathbb{R}^d\times A)$. For a given bounded set $\Omega$, a positive constant $\Lambda$ and a measure $\mu$ with compact support and such that $\mu(\mathbb{R}^d) = \Lambda|\Omega|$, any measure on $\mathbb{R}^d\times\mathbb{R}^d$ with marginals $\Pi_1 = \Lambda\chi_\Omega$ and $\Pi_2 = \mu$ is called a transport plan or coupling between $\Lambda\chi_\Omega$ and $\mu$. We define the Wasserstein distance between $\Lambda\chi_\Omega$ and $\mu$ as
$$W_2^2(\Lambda\chi_\Omega, \mu) := \min_{\Pi_1 = \Lambda\chi_\Omega,\ \Pi_2 = \mu} \int_{\mathbb{R}^d\times\mathbb{R}^d} |x-y|^2\,d\Pi = \min_{T\#\Lambda\chi_\Omega = \mu} \int_\Omega |T-x|^2\,\Lambda\,dx. \tag{2.11}$$
By Brenier's Theorem [31, Theorem 2.12], the minimizer of the right-hand side of (2.11) exists, is called optimal transport map, and is uniquely defined a.e. on $\Omega$ as the gradient of a convex map $\psi$. Conversely, for every convex map $\psi$ and every $\Lambda > 0$, $\nabla\psi$ is the solution of (2.11) for $\mu := \nabla\psi\#\Lambda\chi_\Omega$ (see [31, Theorem 2.12] again).

By [31, Theorem 5.5], we have the time-dependent representation of optimal transport
$$W_2^2(\Lambda\chi_\Omega, \mu) = \min_X \Big\{ \int_\Omega\int_0^1 |\dot X(x,t)|^2\,\Lambda\,dx \ :\ X(x,0) = x,\ X(\cdot,1)\#\Lambda\chi_\Omega = \mu \Big\} \tag{2.12}$$
and if $T$ is the solution of (2.11), the minimizer of (2.12) is given by $X(x,t) = T_t(x) := (1-t)x + tT(x)$, whose trajectories are straight lines. We will often drop the argument $x$ and write $X(t) = X(x,t)$.

As in [23], a central point for our analysis is the Eulerian version of optimal transportation, also known as the Benamou-Brenier formulation (see for instance [31, Theorem 8.1] or [5, Chapter 8]). It states that
$$W_2^2(\Lambda\chi_\Omega, \mu) = \min_{(\rho,j)} \Big\{ \int_{\mathbb{R}^d}\int_0^1 \frac{1}{\rho}|j|^2 \ :\ \partial_t\rho + \nabla\cdot j = 0,\ \rho(0) = \Lambda\chi_\Omega \text{ and } \rho(1) = \mu \Big\}, \tag{2.13}$$
where the continuity equation and the boundary data are understood in the distributional sense, i.e. for every $\zeta\in C^1_c(\mathbb{R}^d\times[0,1])$,
$$\int_{\mathbb{R}^d}\int_0^1 \partial_t\zeta\,\rho + \nabla\zeta\cdot j = \int_{\mathbb{R}^d}\zeta(x,1)\,d\mu - \int_{\mathbb{R}^d}\zeta(x,0)\,\Lambda\chi_\Omega(x)\,dx, \tag{2.14}$$
and where
$$\int_{\mathbb{R}^d}\int_0^1 \frac{1}{\rho}|j|^2 = \int_{\mathbb{R}^d}\int_0^1 |v|^2\,d\rho$$
if $j\ll\rho$ with $\frac{dj}{d\rho} = v$, and infinity otherwise (see [4, Theorem 2.34]). Let us point out that in particular, the admissible measures for (2.13) are allowed to contain singular parts with respect to the Lebesgue measure. We also note that if $K\subset\mathbb{R}^d$ is a compact set and if $(\rho, j)$ are measures on $K\times[0,1]$, then
$$\int_K\int_0^1 \frac{1}{2\rho}|j|^2 = \sup_{\xi\in C^0(K\times[0,1],\,\mathbb{R}^d)} \int_K\int_0^1 \xi\cdot j - \frac{|\xi|^2}{2}\rho. \tag{2.15}$$
If we let for $t\in[0,1]$,
$$\rho_t := T_t\#\Lambda\chi_\Omega \quad\text{and}\quad j_t := T_t\#[(T-\mathrm{Id})\Lambda\chi_\Omega], \tag{2.16}$$
then $j$ is absolutely continuous with respect to $\rho$ and $(\rho, j)$ is the minimizer of (2.13) (see [31, Theorem 8.1] or [5, Chapter 8], and [28, Proposition 5.32] for the uniqueness). Notice that by [31, Proposition 5.9], for $t\in[0,1)$, $\rho_t$ and $j_t$ are absolutely continuous with respect to the Lebesgue measure and $\frac{1}{\rho}|j|^2$ agrees with its pointwise definition. By Alexandrov's Theorem [32, Theorem 14.25], $T$ is differentiable a.e. and by [31, Theorem 4.8], for $t\in[0,1)$,
$$\rho_t(T_t(x))\,\det\nabla T_t(x) = \Lambda\chi_\Omega(x) \tag{2.17}$$
holds a.e.

We say that a map $T$ is monotone if for a.e. $(x,y)$, $(T(x)-T(y))\cdot(x-y) \ge 0$. We will need the following $L^\infty$ bound for monotone maps, proven in$^{iv}$ [23, Lemma 4.1].

Lemma 2.3.

Let T be a monotone map. Let R > be such that R d +2 (cid:82) B R | T − x | (cid:28) . Then sup B R | T − x | + sup B R dist( y, T − ( y )) (cid:46) R (cid:18) R d +2 (cid:90) B R | T − x | (cid:19) d +2 . (2.18) Moreover, letting for t ∈ [0 , , T t = (1 − t ) Id + tT , T − t ( B R ) ⊂ B R . (2.19)Let us show how together with displacement convexity this implies an L ∞ bound for( ρ t , j t ). Lemma 2.4.

Let

Λ = 1 and assume that B ⊂ Ω and E := (cid:82) B | T − x | (cid:28) , where T is the optimal transport map for (2.11) . Then, for a.e. < t < , if ( ρ t , j t ) isgiven by (2.16) , sup B ρ t ≤ − t ) d and sup B | j t | (cid:46) E d +2 − t ) d . (2.20) Moreover, (cid:90) B ρ t | j t | ≤ E and (cid:90) B ρ t (cid:46) . (2.21) Proof.

We start by proving (2.20). The estimate on ρ is a direct consequence ofdisplacement convexity: By concavity of det d on positive symmetric matrices,det d ( ∇ T t ) ≥ (1 − t )det d Id + t det d ∇ T ≥ (1 − t ) (2.22) iv This bound was proven there for optimal transport maps but a quick inspection of the proof showsthat only monotony is used. Similarly, it is stated there for R = 1 but a simple rescaling gives the presentversion of the estimate. nd thus by (2.17), ρ t ( x ) = 1det ∇ T t ( T − t ( x )) ≤ − t ) d . (2.23)We turn to the estimate on j . For ξ ∈ C c ( B , R d ), and t ∈ (0 , (cid:90) B ξ · j t (2.16) = (cid:90) T − t ( B ) ξ ( T t ) · ( T − x ) ≤ sup T − t ( B ) | T − x | (cid:90) T − t ( B ) | ξ ( T t ) | (2.19)&(2.18) (cid:46) E d +2 (cid:90) B | ξ | ρ t (2.23) (cid:46) E d +2 − t ) d (cid:90) B | ξ | . Estimate (2.21) then follows from (2.19): (cid:90) B ρ t | j t | = (cid:90) T − t ( B ) | T − x | ≤ (cid:90) B | T − x | = E and (cid:90) B ρ t = | T − t ( B ) | (cid:46) . Before closing this section, let us spend a few words about optimal transportationon the torus and on the sphere since both problems will appear later on. Let usstart with the periodic setting. For

$L > 0$, we let $Q_L := [-\frac{L}{2}, \frac{L}{2})^d$ be the centered cube of side length $L$. We say that a measure $\mu$ on $\mathbb{R}^d$ is $Q_L$-periodic$^{v}$ if for every $z\in(L\mathbb{Z})^d$ and every measurable set $A$,
$$\mu(A+z) = \mu(A),$$
so that we may identify measures on the flat torus of size $L > 0$ with $Q_L$-periodic measures on $\mathbb{R}^d$. If $\mu$ is a $Q_L$-periodic measure and $\Lambda := \frac{\mu(Q_L)}{L^d}$, then we can define
$$W^2_{2,\mathrm{per}}(\Lambda, \mu) := \min_{T\#\Lambda = \mu} \int_{Q_L} |T-x|^2_{\mathrm{per}}\,\Lambda\,dx, \tag{2.24}$$
where $|\cdot|_{\mathrm{per}}$ denotes the distance on $\mathbb{T}_L$, i.e. $|x-y|_{\mathrm{per}} = \min_{z\in(L\mathbb{Z})^d}|x-y+z|$. By [31, Theorem 2.47] (see also [14, 3] for a simpler proof), there exists a unique (up to additive constants) convex function $\psi$ on $\mathbb{R}^d$ such that if $T$ is the unique solution of (2.24), then for $x\in Q_L$, $T(x) = \nabla\psi(x)$ and for $(x,z)\in\mathbb{R}^d\times(L\mathbb{Z})^d$,
$$\nabla\psi(x+z) = \nabla\psi(x) + z. \tag{2.25}$$

$^{v}$ When it is clear from the context we will simply call them periodic measures.

Remark 2.5. We will often identify $T$ and $\nabla\psi$. Let us point out that although $T$ is defined only Lebesgue a.e., it will sometimes be useful to consider a pointwise defined map, which we then take to be an arbitrary but fixed measurable selection of the subgradient $\partial\psi$ of $\psi$.

Notice that of course for every $Q_L$-periodic measure $\mu$,
$$W^2_{2,\mathrm{per}}(\Lambda, \mu) \le W_2^2(\Lambda\chi_{Q_L}, \mu\llcorner Q_L).$$

Finally, for $f_1$ and $f_2$ two non-negative densities on $\partial B$ with $\int_{\partial B} f_1 = \int_{\partial B} f_2$, we define
$$W^2_{\partial B}(f_1, f_2) := \min_{T\# f_1 = f_2} \int_{\partial B} d^2_{\partial B}(T(x), x)\,df_1, \tag{2.26}$$
where $d_{\partial B}$ is the geodesic distance on $\partial B$. Let us point out that a minimizer exists by McCann's extension of Brenier's Theorem [31, Theorem 2.47]. Notice that $W_{\partial B}(f_1, f_2)$ is comparable to the Wasserstein distance in $\mathbb{R}^d$ between $f_1\mathcal{H}^{d-1}\llcorner\partial B$ and $f_2\mathcal{H}^{d-1}\llcorner\partial B$, that is
$$W_2^2(f_1, f_2) \le W^2_{\partial B}(f_1, f_2) \lesssim W_2^2(f_1, f_2). \tag{2.27}$$
As for (2.12), we have the time-dependent formulation [31, Theorem 5.6]
$$W^2_{\partial B}(f_1, f_2) = \min_X \Big\{ \int_{\partial B}\int_0^1 |\dot X(x,t)|^2\,df_1 \ :\ X(x,0) = x,\ X(\cdot,1)\# f_1 = f_2 \Big\}, \tag{2.28}$$
and the Benamou-Brenier formulation [32, Theorem 13.8]
$$W^2_{\partial B}(f_1, f_2) = \min_{(\rho,j)} \Big\{ \int_{\partial B}\int_0^1 \frac{1}{\rho}|j|^2 \ :\ \partial_t\rho + \nabla^{\mathrm{bdr}}\cdot j = 0,\ \rho(0) = f_1 \text{ and } \rho(1) = f_2 \Big\}, \tag{2.29}$$
where we stress that $j$ is tangent to $\partial B$. Even though it is more delicate than in the Euclidean case, the analog of (2.17) also holds in this case. Indeed, by [32, Theorem 13.8] and [32, Theorem 11.1] (see also [15, Theorem 4.2] and [15, Lemma 6.1] for more details), if$^{vi}$ $T(x) = \exp_x(\nabla\psi(x))$ is the minimizer of (2.26), letting for $t\in[0,1]$, $T_t(x) := \exp_x(t\nabla\psi(x))$ and then $\rho_t := T_t\# f_1$, we have that $\rho_t$ is a minimizer of (2.29) and for every $t\in[0,1]$ and a.e. $x\in\partial B$, the Jacobian equation
$$\rho_t(T_t(x))\,J_t(x) = f_1(x) \tag{2.30}$$
holds, with $J_t$ the Jacobian determinant (see for instance [15, Lemma 6.1] for its definition). The main point for us is that similarly to (2.22), it satisfies by [32, Theorem 14.20] (see also [15, Lemma 6.1] for a statement closer to ours)
$$J_t^{\frac{1}{d-1}} \ge (1-t) + t\,J_1^{\frac{1}{d-1}}, \tag{2.31}$$
where we used the fact that since the sphere has positive Ricci curvature, its volume distortion coefficients are larger than one.

$^{vi}$ Here $\exp_x$ denotes the exponential map on $\partial B$.

We finally prove a simple lemma which will be useful in the construction of competitors in Proposition 3.7 below.

Lemma 2.6. Let $f_1$ and $f_2$ be two non-negative densities on $\partial B$ of equal mass. For every $0\le a < b\le 1$ and $0\le c\le d\le 1$ with $a < c$ and $b < d$, there exists $(\rho, j)$ supported on $\partial B\times[a,d]$ such that for every $\zeta\in C^1(\partial B\times[0,1])$,
$$\int_{\partial B}\int_0^1 \partial_t\zeta\,\rho + \nabla^{\mathrm{bdr}}\zeta\cdot j = \frac{1}{d-c}\int_{\partial B}\int_c^d \zeta\,df_2 - \frac{1}{b-a}\int_{\partial B}\int_a^b \zeta\,df_1 \tag{2.32}$$
and
$$\int_{\partial B}\int_0^1 \frac{1}{\rho}|j|^2 \lesssim \frac{W^2_{\partial B}(f_1, f_2)}{(d-b)-(c-a)}\,\log\frac{d-b}{c-a}, \tag{2.33}$$
with the understanding that for $c = d$,
$$\frac{1}{d-c}\int_{\partial B}\int_c^d \zeta\,df_2 = \int_{\partial B}\zeta(\cdot, c)\,df_2$$
and for $d-b = c-a$,
$$\frac{1}{(d-b)-(c-a)}\,\log\frac{d-b}{c-a} = \frac{1}{c-a}. \tag{2.34}$$

Proof.

For $x\in\partial B$ and $t\in[0,1]$, let $X(x,t)\in\partial B$ be the minimizer of (2.28), i.e. for $\zeta\in C^0(\partial B)$,
$$\int_{\partial B}\zeta(X(x,1))\,df_1 = \int_{\partial B}\zeta\,df_2 \quad\text{and}\quad W^2_{\partial B}(f_1, f_2) = \int_{\partial B}\int_0^1 |\dot X|^2\,df_1. \tag{2.35}$$
Let $\psi$ be the affine function defined on $[a,b]$ through $\psi(a) = c$ and $\psi(b) = d$, and let then for $t\in[s,\psi(s)]$
$$X_s(x,t) := X\Big(x, \frac{t-s}{\psi(s)-s}\Big),$$
so that
$$X_s(x,s) = x \quad\text{and}\quad X_s(x,\psi(s)) = X(x,1). \tag{2.36}$$
We then let $\rho$ be the non-negative measure on $\partial B\times[0,1]$ defined for $\zeta\in C^0(\partial B\times[0,1])$ by
$$\int_{\partial B}\int_0^1 \zeta\,d\rho := \frac{1}{b-a}\int_{\partial B}\int_a^b\int_s^{\psi(s)} \zeta(X_s(x,t), t)\,dt\,ds\,df_1$$
and $j$ the $\mathbb{R}^d$-valued measure defined for $\xi\in C^0(\partial B\times[0,1], \mathbb{R}^d)$ by
$$\int_{\partial B}\int_0^1 \xi\cdot dj := \frac{1}{b-a}\int_{\partial B}\int_a^b\int_s^{\psi(s)} \xi(X_s(x,t), t)\cdot\dot X_s(x,t)\,dt\,ds\,df_1.$$
It is readily seen that with this definition $j\ll\rho$. Let us establish (2.32). For $\zeta\in C^1(\partial B\times[0,1])$,
$$\begin{aligned}
\int_{\partial B}\int_0^1 \partial_t\zeta\,\rho + \nabla^{\mathrm{bdr}}\zeta\cdot j &= \frac{1}{b-a}\int_{\partial B}\int_a^b\int_s^{\psi(s)} \big[\partial_t\zeta(X_s,t) + \nabla^{\mathrm{bdr}}\zeta(X_s,t)\cdot\dot X_s\big]\,dt\,ds\,df_1 \\
&= \frac{1}{b-a}\int_{\partial B}\int_a^b \big[\zeta(X_s(x,\psi(s)),\psi(s)) - \zeta(X_s(x,s),s)\big]\,ds\,df_1 \\
&\overset{(2.36)}{=} \frac{1}{b-a}\int_{\partial B}\int_a^b \big[\zeta(X(x,1),\psi(s)) - \zeta(x,s)\big]\,ds\,df_1 \\
&\overset{(2.35)}{=} \int_{\partial B}\int_a^b \frac{1}{b-a}\,\zeta(x,\psi(s))\,df_2\,ds - \int_{\partial B}\int_a^b \frac{1}{b-a}\,\zeta(x,s)\,ds\,df_1 \\
&= \int_{\partial B}\int_c^d \frac{1}{d-c}\,\zeta\,df_2\,d\hat s - \int_{\partial B}\int_a^b \frac{1}{b-a}\,\zeta\,ds\,df_1,
\end{aligned}$$
where we made the change of variables $\hat s = \psi(s)$ in the last equality.

We now turn to (2.33). Using (2.15) and the definition of $(\rho, j)$,
$$\begin{aligned}
\int_{\partial B}\int_0^1 \frac{1}{2\rho}|j|^2 &= \sup_{\xi\in C^0(\partial B\times[0,1],\,\mathbb{R}^d)} \int_{\partial B}\int_0^1 \xi\cdot j - \frac{|\xi|^2}{2}\rho \\
&= \sup_{\xi} \frac{1}{b-a}\int_{\partial B}\int_a^b\int_s^{\psi(s)} \Big[\xi(X_s(x,t),t)\cdot\dot X_s(x,t) - \frac12|\xi(X_s(x,t),t)|^2\Big]\,dt\,ds\,df_1 \\
&\le \frac{1}{b-a}\int_{\partial B}\int_a^b\int_s^{\psi(s)} \sup_{\xi\in\mathbb{R}^d}\Big(\xi\cdot\dot X_s(x,t) - \frac12|\xi|^2\Big)\,dt\,ds\,df_1 \\
&= \frac{1}{b-a}\int_{\partial B}\int_a^b\int_s^{\psi(s)} \frac12|\dot X_s(x,t)|^2\,dt\,ds\,df_1 \\
&= \frac{1}{b-a}\int_{\partial B}\int_a^b\int_s^{\psi(s)} \frac{1}{2(\psi(s)-s)^2}\Big|\dot X\Big(x,\frac{t-s}{\psi(s)-s}\Big)\Big|^2\,dt\,ds\,df_1 \\
&\overset{\hat t = \frac{t-s}{\psi(s)-s}}{=} \frac{1}{b-a}\int_{\partial B}\int_a^b\int_0^1 \frac{1}{2(\psi(s)-s)}|\dot X(x,\hat t)|^2\,d\hat t\,ds\,df_1 \\
&= \frac{W^2_{\partial B}(f_1, f_2)}{2(b-a)}\int_a^b \frac{ds}{\psi(s)-s}.
\end{aligned}$$
Let us finally estimate $\frac{1}{b-a}\int_a^b \frac{ds}{\psi(s)-s}$.
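The average $\frac{1}{b-a}\int_a^b \frac{ds}{\psi(s)-s}$ has the closed form appearing in (2.33), with the convention (2.34) when $d-b = c-a$. As a quick numerical sanity check (a sketch that is not part of the paper; the function names `mean_inverse_gap` and `closed_form` are illustrative), the following Python snippet compares a midpoint-rule approximation of the integral with that closed form.

```python
import math

def mean_inverse_gap(a, b, c, d, n=20_000):
    """Midpoint-rule approximation of (1/(b-a)) * int_a^b ds / (psi(s) - s),
    where psi is the affine map on [a, b] with psi(a) = c and psi(b) = d."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        s = a + (i + 0.5) * h
        psi = c + (d - c) * (s - a) / (b - a)
        total += h / (psi - s)
    return total / (b - a)

def closed_form(a, b, c, d):
    """log((d-b)/(c-a)) / ((d-b)-(c-a)), read as 1/(c-a) when d-b equals c-a."""
    p, q = c - a, d - b
    if math.isclose(p, q):
        return 1.0 / p
    return math.log(q / p) / (q - p)

if __name__ == "__main__":
    # Parameters satisfy 0 <= a < b <= 1, 0 <= c <= d <= 1, a < c, b < d.
    print(mean_inverse_gap(0.0, 0.5, 0.2, 1.0), closed_form(0.0, 0.5, 0.2, 1.0))
```

The assumptions $a < c$ and $b < d$ guarantee $\psi(s) - s > 0$ on $[a,b]$, so the integrand is bounded and the midpoint rule is adequate here.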
By definition of $\psi$,
$$\frac{1}{b-a}\int_a^b \frac{ds}{\psi(s)-s} = \frac{1}{b-a}\int_a^b \frac{ds}{c - s + \frac{d-c}{b-a}(s-a)} \overset{s = (1-t)a + tb}{=} \int_0^1 \frac{dt}{(1-t)(c-a) + t(d-b)} = \frac{1}{(d-b)-(c-a)}\,\log\frac{d-b}{c-a},$$
where we take as convention (2.34) if $d-b = c-a$.

Let $\Gamma$ be the set of all locally finite counting measures on $\mathbb{R}^d$,
$$\Gamma = \Big\{ \mu \ :\ \mu = \sum_i \delta_{y_i},\ y_i\in\mathbb{R}^d,\ \mu(K) < \infty\ \ \forall K \text{ compact} \Big\},$$
where $\Gamma$ is equipped with the $\sigma$-field $\mathcal{F}$ generated by the mappings $\mu\mapsto\mu(A)$ for Borel sets $A\subset\mathbb{R}^d$. We say that $(\mu, \mathbb{P})$ (or simply $\mu$) is a Poisson point process with intensity measure Lebesgue (or simply a Poisson point process) if $\mathbb{P}$ is a probability measure on $\Gamma$ such that

(i) if $A_1, \dots, A_k$ are disjoint Borel sets, then $\mu(A_1), \dots, \mu(A_k)$ are independent integer-valued random variables;

(ii) for any Borel set $A$ with $|A| < \infty$, the random variable $\mu(A)$ has a Poisson distribution with parameter $|A|$, i.e. for every $n\in\mathbb{N}$,
$$\mathbb{P}[\mu(A) = n] = \exp(-|A|)\,\frac{|A|^n}{n!}. \tag{2.37}$$

For a set $\Omega\subset\mathbb{R}^d$, we define the Poisson point process on $\Omega$ as the restriction of the Poisson point process on $\mathbb{R}^d$ to $\Omega$. It could be equivalently defined through properties (i) and (ii) above restricted to subsets of $\Omega$.

We let $\theta : \mathbb{R}^d\times\Gamma\to\Gamma$ be the shift operator, that is for $(z,\mu)\in\mathbb{R}^d\times\Gamma$ and a Borel set $A\subset\mathbb{R}^d$,
$$\theta_z\mu(A) := \mu(A+z), \tag{2.38}$$
which we write shortly as $\theta_z\mu(\cdot) = \mu(\cdot + z)$. Moreover, we note that $\mathbb{P}$ is stationary in the sense that it is invariant under the action of $\theta$, i.e. $\mathbb{P}\circ\theta = \mathbb{P}$.

For $\ell > 0$, we recall that $Q_\ell = [-\frac{\ell}{2}, \frac{\ell}{2})^d$. The optimal matching problem consists in understanding the behavior as $\ell\to\infty$ of$^{vii}$
$$W_2^2\Big(\mu\llcorner Q_\ell,\ \frac{\mu(Q_\ell)}{\ell^d}\chi_{Q_\ell}\Big) \tag{2.39}$$
together with the properties of the corresponding optimal transport maps. We will use as shorthand notation $\mu_\ell := \mu\llcorner Q_\ell$ and when it is clear from the context we will identify a constant $\Lambda > 0$ with the measure $\Lambda\chi_{Q_\ell}$. As explained in the introduction, it is known that $\mathbb{E}\big[W_2^2\big(\mu_\ell, \frac{\mu(Q_\ell)}{\ell^d}\big)\big] \sim \ell^d$ for $d\ge 3$ [...] $d = 2$, where an additional renormalization of the maps is needed in order to be able to pass to the limit. Hence, we assume from now on $d = 2$.

$^{vii}$ In order to have shift-invariance properties, we will actually consider periodic variants of (2.39), see below.

The main stochastic ingredient is a control at every scale $1\ll\ell<\infty$ of (2.39). This estimate is a quite direct consequence of a result of Ambrosio, Stra and Trevisan [6] which we now recall.

Since the results of [6] are not stated for the Poisson point process but rather for a deterministic number $n\to\infty$ of uniform iid random variables $X_i$ on a given domain $Q_\ell$, we need to introduce some more notation. For a given $n\in\mathbb{N}$, we let the probability $\mathbb{P}_n$ on $\Gamma$ be defined as
$$\mathbb{P}_n[F] := \frac{\mathbb{P}[F\cap\{\mu_\ell(Q_\ell) = n\}]}{\mathbb{P}[\mu_\ell(Q_\ell) = n]},$$
and let $\mathbb{E}_n$ be the associated expectation. Note that by (2.37), we have
$$p_n := \mathbb{P}[\mu_\ell(Q_\ell) = n] = \exp(-\ell^2)\,\frac{\ell^{2n}}{n!}. \tag{2.40}$$
Equipped with this probability measure, $\mu_\ell$ can be identified with $n$ uniformly distributed iid random variables $X_i$ on $Q_\ell$.
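The identification above suggests a simple way to simulate the Poisson point process on $Q_\ell$ in dimension two: draw the number of points from the Poisson law (2.40) and then place that many independent uniform points. The Python sketch below illustrates this under stated assumptions; it is not taken from the paper, and the helper names `sample_poisson_count`, `sample_poisson_points` and `poisson_pmf` are hypothetical.

```python
import math
import random

def sample_poisson_count(mean, rng):
    """Poisson(mean) variate, obtained as the number of arrivals of a
    rate-1 exponential clock in the time interval [0, mean]."""
    count, clock = 0, rng.expovariate(1.0)
    while clock <= mean:
        count += 1
        clock += rng.expovariate(1.0)
    return count

def sample_poisson_points(ell, rng):
    """Points of a unit-intensity Poisson process on the square Q_ell
    (up to floating-point rounding at the right endpoint of the interval)."""
    n = sample_poisson_count(ell * ell, rng)
    half = ell / 2
    return [(rng.uniform(-half, half), rng.uniform(-half, half)) for _ in range(n)]

def poisson_pmf(n, area):
    """p_n = exp(-area) * area**n / n!, evaluated in log space for stability."""
    return math.exp(-area + n * math.log(area) - math.lgamma(n + 1))

if __name__ == "__main__":
    rng = random.Random(0)
    points = sample_poisson_points(10.0, rng)
    print(len(points), "points; expected number is", 10.0 ** 2)
```

Conditioning on the event that the sample has exactly $n$ points corresponds to working under $\mathbb{P}_n$.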
A simple rescaling shows that
$$\frac{1}{\ell^2\log n}\,\mathbb{E}_n\Big[W_2^2\Big(\mu_\ell, \frac{\mu(Q_\ell)}{\ell^2}\Big)\Big]$$
is independent of $\ell$, and [6, Theorem 1.1] states that
$$\lim_{n\to\infty} \frac{1}{\ell^2\log n}\,\mathbb{E}_n\Big[W_2^2\Big(\mu_\ell, \frac{\mu(Q_\ell)}{\ell^2}\Big)\Big] = \frac{1}{4\pi}. \tag{2.41}$$
Arguing as in [6, Remark 4.7] and using the fact that the uniform measure on $[0,1]^2$ [...], there exists $c > 0$ such that for $M\ge 1$ and $n$ large enough uniformly$^{viii}$ in $\ell$,
$$\mathbb{P}_n\Big[\frac{1}{\ell^2\log n}\,W_2^2\Big(\mu_\ell, \frac{\mu(Q_\ell)}{\ell^2}\Big) \ge M\Big] \le \exp(-cM\log n). \tag{2.42}$$

$^{viii}$ Notice that the left-hand side of (2.42) actually does not depend on $\ell$.

Let us now show how (2.41) and (2.42) translate into our setting.

Proposition 2.7. Let $\mu$ be a Poisson point process on $\mathbb{R}^2$. Then,
$$\lim_{\ell\to\infty} \frac{1}{\ell^2\log\ell}\,\mathbb{E}\Big[W_2^2\Big(\mu_\ell, \frac{\mu(Q_\ell)}{\ell^2}\Big)\Big] = \frac{1}{2\pi} \tag{2.43}$$
and there exists a universal constant $c$ independent of $\ell$ and $M$ such that for $\ell\ge$ [...] and $M\ge 1$,
$$\mathbb{P}\Big[\frac{1}{\ell^2\log\ell}\,W_2^2\Big(\mu_\ell, \frac{\mu(Q_\ell)}{\ell^2}\Big) \ge M\Big] \le \exp(-cM\log\ell). \tag{2.44}$$

Proof. Throughout the proof, the value of the constant $c$ may change from one inequality to the next. We first note that by the Cramér-Chernoff bounds for the Poisson distribution with intensity $\ell^2$ (see [12]), there exists a constant $c$ such that
$$\mathbb{P}\Big[\frac{\mu(Q_\ell)}{\ell^2}\notin\Big[\frac12, 2\Big]\Big] \le \exp(-c\ell^2), \tag{2.45}$$
and for $M\gg 1$,
$$\mathbb{P}\Big[\frac{\mu(Q_\ell)}{\ell^2}\ge M\Big] \le \exp(-c\ell^2 M). \tag{2.46}$$
Let us now prove (2.43). Recall $p_n$ from (2.40). By definition of $\mathbb{E}_n$ we have
$$\frac{1}{\ell^2\log\ell}\,\mathbb{E}\Big[W_2^2\Big(\mu_\ell, \frac{\mu(Q_\ell)}{\ell^2}\Big)\Big] = \frac{1}{\ell^2\log\ell}\sum_{n\notin[\ell^2/2,\,2\ell^2]} p_n\,\mathbb{E}_n\Big[W_2^2\Big(\mu_\ell, \frac{\mu(Q_\ell)}{\ell^2}\Big)\Big] + \frac{1}{\ell^2\log\ell}\sum_{n\in[\ell^2/2,\,2\ell^2]} p_n\,\mathbb{E}_n\Big[W_2^2\Big(\mu_\ell, \frac{\mu(Q_\ell)}{\ell^2}\Big)\Big]. \tag{2.47}$$
Using the crude transport estimate $W_2^2\big(\mu_\ell, \frac{\mu(Q_\ell)}{\ell^2}\big) \le 2\ell^2\mu(Q_\ell)$ (the squared diameter of $Q_\ell$ times the total mass) together with (2.40), we can estimate the first term as
$$\frac{1}{\ell^2\log\ell}\sum_{n\notin[\ell^2/2,\,2\ell^2]} p_n\,\mathbb{E}_n\Big[W_2^2\Big(\mu_\ell, \frac{\mu(Q_\ell)}{\ell^2}\Big)\Big] \le \frac{2}{\log\ell}\sum_{n\notin[\ell^2/2,\,2\ell^2]} n\,p_n = \frac{2}{\log\ell}\sum_{n\notin[\ell^2/2,\,2\ell^2]} \exp(-\ell^2)\,\frac{\ell^{2n}}{(n-1)!} \lesssim \frac{\ell^2}{\log\ell}\,\mathbb{P}\Big[\frac{\mu(Q_\ell)}{\ell^2}\notin\Big[\frac12, 2\Big]\Big] \overset{(2.45)}{\lesssim} \frac{\ell^2\exp(-c\ell^2)}{\log\ell},$$
which goes to zero as $\ell\to\infty$. The second term in (2.47) can be rewritten as
$$\frac{1}{\ell^2\log\ell}\sum_{n\in[\ell^2/2,\,2\ell^2]} p_n\,\mathbb{E}_n\Big[W_2^2\Big(\mu_\ell, \frac{\mu(Q_\ell)}{\ell^2}\Big)\Big] = \sum_{n\in[\ell^2/2,\,2\ell^2]} p_n\,\frac{\log n}{\log\ell}\,\Big(\frac{1}{\ell^2\log n}\,\mathbb{E}_n\Big[W_2^2\Big(\mu_\ell, \frac{\mu(Q_\ell)}{\ell^2}\Big)\Big]\Big).$$
Since by (2.41), $\frac{1}{\ell^2\log n}\,\mathbb{E}_n\big[W_2^2\big(\mu_\ell, \frac{\mu(Q_\ell)}{\ell^2}\big)\big]$ converges uniformly in $\ell$ to $\frac{1}{4\pi}$, it is enough to show that
$$\lim_{\ell\to\infty}\sum_{n\in[\ell^2/2,\,2\ell^2]} p_n\,\frac{\log n}{\log\ell} = 2.$$
This is a simple consequence of (2.45) and the fact that
$$\frac{2\log\ell - \log 2}{\log\ell}\sum_{n\in[\ell^2/2,\,2\ell^2]} p_n \ \le \sum_{n\in[\ell^2/2,\,2\ell^2]} p_n\,\frac{\log n}{\log\ell} \ \le\ \frac{2\log\ell + \log 2}{\log\ell}\sum_{n\in[\ell^2/2,\,2\ell^2]} p_n.$$
We now turn to (2.44). Write $W := W_2^2\big(\mu_\ell, \frac{\mu(Q_\ell)}{\ell^2}\big)$ for brevity. For $1\le M\lesssim\frac{\ell^2}{\log\ell}$, by definition of $\mathbb{P}_n$,
$$\begin{aligned}
\mathbb{P}\Big[\frac{W}{\ell^2\log\ell}\ge M\Big] &= \sum_{n\notin[\ell^2/2,\,2\ell^2]} p_n\,\mathbb{P}_n\Big[\frac{W}{\ell^2\log\ell}\ge M\Big] + \sum_{n\in[\ell^2/2,\,2\ell^2]} p_n\,\mathbb{P}_n\Big[\frac{W}{\ell^2\log\ell}\ge M\Big] \\
&\le \mathbb{P}\Big[\frac{\mu(Q_\ell)}{\ell^2}\notin\Big[\frac12, 2\Big]\Big] + \sum_{n\in[\ell^2/2,\,2\ell^2]} p_n\,\mathbb{P}_n\Big[\frac{W}{\ell^2\log\ell}\ge M\Big] \\
&\overset{(2.45)}{\le} \exp(-c\ell^2) + \sum_{n\in[\ell^2/2,\,2\ell^2]} p_n\,\mathbb{P}_n\Big[\frac{W}{\ell^2\log n}\ge \frac{\log\ell}{2\log\ell+\log 2}\,M\Big] \\
&\overset{(2.42)}{\le} \exp(-c\ell^2) + \sum_{n\in[\ell^2/2,\,2\ell^2]} p_n\exp(-cM\log n) \\
&\le \exp(-c\ell^2) + \exp(-cM\log\ell) \le \exp(-cM\log\ell),
\end{aligned}$$
while for $M\gg\frac{\ell^2}{\log\ell}$, using once more the estimate $W \le 2\ell^2\mu(Q_\ell)$ together with (2.46), we obtain
$$\mathbb{P}\Big[\frac{W}{\ell^2\log\ell}\ge M\Big] \le \mathbb{P}\Big[\frac{\mu(Q_\ell)}{\ell^2}\ge\frac{M\log\ell}{2\ell^2}\Big] \le \exp(-cM\log\ell).$$

By a Borel-Cantelli argument we can now strengthen (2.44) into a supremum bound.
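The limit $\sum_{n\in[\ell^2/2,\,2\ell^2]} p_n \frac{\log n}{\log\ell}\to 2$ used in the proof of (2.43) can be observed numerically. The following Python sketch (an illustration that is not part of the paper; the function name `weighted_log_ratio` is hypothetical) evaluates the sum directly, with the Poisson weights $p_n$ from (2.40) computed in log space to avoid overflow.

```python
import math

def weighted_log_ratio(ell):
    """Sum over integers n in [ell^2/2, 2 ell^2] of p_n * log(n) / log(ell),
    where p_n = exp(-ell^2) * ell^(2n) / n! is the Poisson weight."""
    lam = ell * ell
    lo, hi = math.ceil(lam / 2), math.floor(2 * lam)
    total = 0.0
    for n in range(lo, hi + 1):
        log_p = -lam + n * math.log(lam) - math.lgamma(n + 1)
        total += math.exp(log_p) * math.log(n)
    return total / math.log(ell)

if __name__ == "__main__":
    for ell in (4.0, 8.0, 16.0, 32.0):
        print(ell, weighted_log_ratio(ell))
```

The printed values approach 2 as $\ell$ grows, reflecting that $\mu(Q_\ell)$ concentrates around $\ell^2$, where $\log n \approx 2\log\ell$.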

Theorem 2.8. Let $\mu$ be a Poisson point process on $\mathbb{R}^2$. Then, there exist a universal constant $c$ and a random variable$^{ix}$ $r_* = r_*(\mu)\ge 1$ with $\mathbb{E}\big[\exp\big(\frac{c\,r_*^2}{\log(2r_*)}\big)\big] < \infty$ such that for every dyadic $\ell$ with $2r_*\le\ell$,
$$\frac{1}{\ell^4}\,W_2^2\Big(\mu_\ell, \frac{\mu(Q_\ell)}{\ell^2}\Big) \le \frac{\log\big(\frac{\ell}{r_*}\big)}{\big(\frac{\ell}{r_*}\big)^2}. \tag{2.48}$$

$^{ix}$ Notice that we keep implicit here the dependence on $\mu$.

Proof. We first prove that there exist a constant $\bar c > 0$ and a random variable $\Theta$ with $\mathbb{E}[\exp(\bar c\,\Theta)] < \infty$ such that for all dyadic $\ell$ with $\ell\gg 1$,
$$\frac{1}{\ell^2\log\ell}\,W_2^2\Big(\mu_\ell, \frac{\mu(Q_\ell)}{\ell^2}\Big) \le \Theta. \tag{2.49}$$
For $k\ge 1$, let $\ell_k := 2^k$ and put
$$\Theta_k := \frac{1}{\ell_k^2\log\ell_k}\,W_2^2\Big(\mu_{\ell_k}, \frac{\mu(Q_{\ell_k})}{\ell_k^2}\chi_{Q_{\ell_k}}\Big), \qquad \Theta := \sup_{k\ge 1}\Theta_k. \tag{2.50}$$
We claim that the exponential moments of $\Theta_k$ given by Proposition 2.7 translate into exponential moments for $\Theta$. Indeed, fix $1\gg\bar c > 0$. Then, we estimate
$$\begin{aligned}
\mathbb{E}[\exp(\bar c\,\Theta)] &\le \exp(\bar c) + \sum_{M\in\mathbb{N}} \mathbb{P}[\Theta\ge M]\exp(\bar c(M+1)) \\
&\le \exp(\bar c) + \sum_{M\in\mathbb{N}} \exp(\bar c(M+1))\sum_{k\ge 1}\mathbb{P}[\Theta_k\ge M] \\
&\overset{(2.44)}{\le} \exp(\bar c) + \exp(\bar c)\sum_{k\ge 1}\sum_{M\in\mathbb{N}} \exp(-M(c\log\ell_k - \bar c)) \\
&\overset{\ell_k = 2^k\ \&\ \bar c\ll 1}{\lesssim} \exp(\bar c) + \exp(\bar c)\sum_{k\ge 1}\exp(-ck) < \infty.
\end{aligned}$$
Hence, $\Theta$ has exponential moments and (2.49) is satisfied for every large enough dyadic $\ell$.

Define $r_*\ge 1$ by
$$\frac{r_*^2}{\log(2r_*)} = \frac{\Theta}{\log 2}, \tag{2.51}$$
which has a solution since $r_*\mapsto r_*^2/\log(2r_*)$ is monotone on $(\sqrt{e}/2, \infty)$. Since $\ell\mapsto\frac{\log(\ell/r_*)}{\log\ell}$ is an increasing function, we have for $\ell\ge 2r_*$,
$$\log\ell \le \frac{\log(2r_*)}{\log 2}\,\log\Big(\frac{\ell}{r_*}\Big),$$
which together with (2.51) gives for every dyadic $\ell\ge 2r_*$,
$$\Theta\log\ell \le r_*^2\,\log\Big(\frac{\ell}{r_*}\Big),$$
from which we see, after dividing (2.49) by $\ell^2$ and multiplying by $\log\ell$, that (2.49) implies (2.48).

We remark that $r_*$ inherits all stationarity properties of the Poisson process as a measurable function of the Poisson process (similarly for $r_{*,L}$ in Theorem 2.10). We will not explicitly mention this in the sequel.

2.4.2 The periodic problem

Since our aim is to construct a coupling on $\mathbb{R}^2$ between Lebesgue and Poisson which keeps some of the shift-invariance properties of the Poisson point process, it is more convenient to work, for finite-size cubes, also with a shift-invariant point process. For a given side length $L$, we consider the $Q_L$-periodic Poisson point process (which can be identified with the Poisson point process on the flat torus of size $L$). For $\mu\in\Gamma$, we let $\mu^{\mathrm{per}}_L$ be the $Q_L$-periodic extension of $\mu\llcorner Q_L$ and then
$$\mathbb{P}_L := \mu^{\mathrm{per}}_L\#\mathbb{P}. \tag{2.52}$$
We denote by $\mathbb{E}_L$ the expectation with respect to $\mathbb{P}_L$. We then call $(\mu, \mathbb{P}_L)$ (or simply $\mu$ when it is clear from the context) a $Q_L$-periodic Poisson point process (or Poisson point process on the torus). Since $\mathbb{P}$ is invariant under $\theta$, so is $\mathbb{P}_L$.
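To make the periodic objects concrete, here is a small Python sketch in dimension two of the periodic distance $|x-y|_{\mathrm{per}}$ from (2.24) and of the shift (2.38) acting on a $Q_L$-periodic configuration stored through its points in $Q_L$. It is only an illustration under these representation assumptions; the helper names `wrap`, `per_dist` and `shift` are not from the paper.

```python
import math

def wrap(x, L):
    """Representative of the real number x modulo L in [-L/2, L/2)."""
    return (x + L / 2) % L - L / 2

def per_dist(x, y, L):
    """|x - y|_per = min over z in (L Z)^2 of |x - y + z|.
    The minimum can be taken coordinate by coordinate."""
    return math.hypot(wrap(x[0] - y[0], L), wrap(x[1] - y[1], L))

def shift(points, z, L):
    """Points in Q_L of theta_z mu for a Q_L-periodic mu given by its points in Q_L.
    Since theta_z mu(A) = mu(A + z), each point y moves to y - z, wrapped into Q_L."""
    return [(wrap(y[0] - z[0], L), wrap(y[1] - z[1], L)) for y in points]

if __name__ == "__main__":
    L = 2.0
    pts = [(0.5, -0.25), (-0.75, 0.1)]
    print(per_dist(pts[0], pts[1], L), shift(pts, (0.3, 0.4), L))
```

Shifting by $z$ and by $z + Lk$ with $k\in\mathbb{Z}^2$ produces the same configuration, and the periodic distance between two points is unchanged by a common shift.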
Notice that for $\ell\le L$ the restriction of a $Q_L$-periodic Poisson point process to $Q_\ell$ is a Poisson point process on $Q_\ell$ in the sense of Section 2.4.1.

For $\mu\in\Gamma$ and $L > 1$, our main focus will be to understand at every scale $1\ll\ell\le L$ the structure of the optimal transport map $T_{\mu,L}$ on the torus between $\frac{\mu(Q_L)}{L^2}$ and $\mu$, i.e. $T_{\mu,L}$ is the unique minimizer of (2.24). We will often identify $T_{\mu,L}$ with $\nabla\psi_{\mu,L}$ where $\psi_{\mu,L}$ is the convex potential given in (2.25). When it is clear from the context, we drop the dependence of $T_{\mu,L}$ and $\psi_{\mu,L}$ on either $\mu$, $L$ or both. Uniqueness of the optimal transport map solving (2.24) implies that $T_\mu$ is covariant in the sense that
$$T_\mu(x+z) = T_{\theta_z\mu}(x) + z \qquad \forall x, z\in\mathbb{R}^2. \tag{2.53}$$
For future reference, let us prove a corresponding stationarity property of the potentials.

Lemma 2.9. Let $\mu$ be a $Q_L$-periodic Poisson point process and let $T_\mu = \nabla\psi_\mu$ be the optimal transport map between $\frac{\mu(Q_L)}{L^2}$ and $\mu$ on the torus. Then, for $z\in\mathbb{R}^2$,
$$\psi_{\theta_z\mu}(x) = \psi_{\theta_z\mu}(0) - \psi_\mu(z) + \psi_\mu(x+z) - z\cdot x \qquad \forall x\in\mathbb{R}^2 \tag{2.54}$$
and if $\psi^*$ is the convex conjugate of $\psi$,
$$\psi^*_{\theta_z\mu}(y) = \psi^*_{\theta_z\mu}(0) - \psi^*_\mu(z) + \psi^*_\mu(y+z) - z\cdot y \qquad \forall y\in\mathbb{R}^2. \tag{2.55}$$
As a consequence, if we let for $h\in\mathbb{R}^2$, $D^2_h\psi(x) := \psi(x+h) + \psi(x-h) - 2\psi(x)$, then
$$D^2_h\psi_{\theta_z\mu}(x) = D^2_h\psi_\mu(x+z) \quad\text{and}\quad D^2_h\psi^*_{\theta_z\mu}(y) = D^2_h\psi^*_\mu(y+z). \tag{2.56}$$

Proof. Equation (2.54) is a direct consequence of (2.53), so that we just need to prove that it implies (2.55). By definition,
$$\begin{aligned}
\psi^*_{\theta_z\mu}(y) &= \sup_x\,[x\cdot y - \psi_{\theta_z\mu}(x)] \\
&\overset{(2.54)}{=} \sup_x\,[x\cdot y - \psi_{\theta_z\mu}(0) + \psi_\mu(z) - \psi_\mu(x+z) + z\cdot x] \\
&= -\psi_{\theta_z\mu}(0) + \psi_\mu(z) - y\cdot z - |z|^2 + \sup_x\,[(x+z)\cdot(y+z) - \psi_\mu(x+z)] \\
&= -\psi_{\theta_z\mu}(0) + \psi_\mu(z) - y\cdot z - |z|^2 + \psi^*_\mu(y+z).
\end{aligned}$$
Applying this to $y = 0$, we obtain $\psi^*_{\theta_z\mu}(0) - \psi^*_\mu(z) = -\psi_{\theta_z\mu}(0) + \psi_\mu(z) - |z|^2$, so that (2.55) follows.

Let us finally translate the result of Theorem 2.8 into the periodic setting. In particular, the following result contains Theorem 1.3.

Theorem 2.10.

There exists a universal constant c such that for L = 2 k , k ∈ N ,dyadic and µ a Q L − periodic Poisson point process, there exists a family of randomvariables r ∗ ,L ≥ with sup L E L (cid:20) exp (cid:18) cr ∗ ,L log(2 r ∗ ,L ) (cid:19)(cid:21) < ∞ such that if r ∗ ,L ≤ L wehave µ ( Q L ) L ∈ (cid:20) , (cid:21) , Spt µ ∩ B r ∗ ,L (cid:54) = ∅ (2.57) and for every dyadic (cid:96) with r ∗ ,L ≤ (cid:96) ≤ L , (cid:96) W , per (cid:18) µ (cid:96) , µ ( Q (cid:96) ) (cid:96) (cid:19) ≤ (cid:96) W (cid:18) µ (cid:96) , µ ( Q (cid:96) ) (cid:96) (cid:19) ≤ log (cid:16) (cid:96)r ∗ ,L (cid:17)(cid:16) (cid:96)r ∗ ,L (cid:17) . (2.58) Proof.

Let ﬁrst (cid:101) r ∗ ,L = (cid:101) r ∗ ,L ( µ ) be deﬁned by (cid:101) r ∗ ,L := inf (cid:40) r : 1 (cid:96) W (cid:18) µ (cid:96) , µ ( Q (cid:96) ) (cid:96) (cid:19) ≤ log (cid:0) (cid:96)r (cid:1)(cid:0) (cid:96)r (cid:1) , for all dyadic (cid:96) with 2 r ≤ (cid:96) ≤ L (cid:41) , where we take the convention that (cid:101) r ∗ ,L = L/ r ∗ ,L = r ∗ ,L ( µ ) by r ∗ ,L := (cid:40) L if µ ( Q L ) L / ∈ (cid:2) , (cid:3) max( (cid:101) r ∗ ,L , min Spt µ | y | ) otherwise . Since P L (cid:104) µ ( Q L ) L / ∈ (cid:2) , (cid:3)(cid:105) ≤ exp( − c L ) for some universal c > r ≤ L/ P L (Spt µ ∩ B r = ∅ ) = exp( − r ), it is enough to prove that (cid:101) r ∗ ,L satisﬁessup L E L (cid:20) exp (cid:18) c (cid:101) r ∗ ,L log(2 (cid:101) r ∗ ,L ) (cid:19)(cid:21) < ∞ for some c >

0. Now this follows directly from(2.48) and the fact that for general potentially non-periodic µ ∈ Γ, (cid:101) r ∗ ,L ( µ per L ) ≤ r ∗ ( µ ).The ﬁrst inequality in (2.58) follows from the fact that W , per ≤ W . Remark 2.11.

To avoid confusion between periodic and Euclidean objects, we wouldlike to stress a few things which will be important in Section 4. While the map T µ is deﬁned through the periodic optimal transport problem, estimate (2.58) gives abound on the Euclidean Wasserstein distance between the restrictions µ (cid:96) and thecorresponding multiples of the Lebesgue measure on Q (cid:96) . The reason for the twodiﬀerent transport problems is that on the one hand we want to work with a map whichhas good stationarity properties and on the other hand, for the iteration argument elow, it is more natural to have bounds on the Euclidean Wasserstein distancesbetween µ (cid:96) and µ ( Q (cid:96) ) (cid:96) χ Q (cid:96) .Presently, conditions (2.57) come out of the blue but they will be very useful in Section4. Similarly, the ﬁrst inequality in (2.58) will allow us to initialize the iterationargument in Theorem 1.1. In this section we prove our main regularity result which states that for every di-mension d ≥

2, given an optimal transport map T between a bounded set Ω and ameasure µ , if at some scale R > E ( µ, T, R ) := 1 R d +2 (cid:90) B R | T − x | (3.1)and the local squared Wasserstein distance of µ in O ⊃ B R to a constant D ( µ, O, R ) := 1 R d +2 W (cid:18) µ O, µ ( O ) | O | χ O (cid:19) (3.2)are small, then on B R , T is quantitatively close (in terms of E and D ) to an harmonicgradient ﬁeld. This is similar to [23, Proposition 4.6] with the major diﬀerence thathere the measure µ is arbitrary and can be in particular singular. Let us point outthat we allow for O (cid:54) = B R only because of the application we have in mind to theoptimal matching problem where we have good control on cubes instead of balls (seeTheorem 2.8 and the proof of Theorem 1.1).By scaling we will mostly work here with R = 1 and will use the notation E for E ( µ, T,

1) and similarly for D . The global strategy is similar to the one used in [23]and goes through an estimate at the Eulerian level (2.13). However, as opposed to[23], if ( ρ, j ) is the minimizer of (2.13) it does not satisfy a global L ∞ bound (seeLemma 2.4). We will thus need to introduce a terminal layer. In [6], regularizationby the heat ﬂow is used as an alternative approach to tackle this issue.For τ >

0, let ρ := (cid:90) − τ ρ t dt. By (2.20), we have ρ (cid:46) τ − ( d − in B . (3.3)As in [23], we would like to use the ﬂux of j as boundary data for the solution of thePoisson equation we will consider. This requires choosing a good radius R for which j satisﬁes good estimates on ∂B R . In our setting, this is much more complex than in[23] and is the purpose of the next section. Let us point out that since the estimateswe want to use are on the L scale, we would need that the ﬂux of j through ∂B R is well controlled in L . Since this is in general not the case, we will also need toreplace this ﬂux by a more regular one (see Lemma 3.4 below). R − f R + X ( t R − ) X ( t R + ) ρt = 1 t = 1 − τt = 0 ∂B R ∂B R ρ = µ ρ = 1 Figure 1: The deﬁnition of f R ± . Let X ( x, t ) = T t ( x ) be the solution of the time-dependent version of optimal transport(2.12). Let us recall that the corresponding trajectories t → T t ( x ) are straightsegments and that we often drop the dependence in x when it is not necessary tospecify it. For R >

0, and a given trajectory X passing through B R we let (seeFigure 1) t R − := min { t ∈ [0 ,

1] : X ( t ) ∈ B R } and t R + := max { t ∈ [0 ,

1] : X ( t ) ∈ B R } be the entrance and exit times. If X ( t ) does not intersect B R , we set t R − = 1and t R + = 0. Notice that for t R − < X ( t R − ) ∈ ∂B R and likewise t R + > X ( t R + ) ∈ ∂B R . We now deﬁne the ﬂux f R of j through ∂B R (formally f R = j · ν B R , where ν B R denotes the outward normal to B R ) by its action on functions ζ ∈ C c ( R d × [0 , (cid:90) R d (cid:90) ζdf R := (cid:90) Ω χ ≤ t R −

Let ( ρ, j ) be a minimizer of (2.13) and let f R be deﬁned by (3.4) .Then, for every ζ ∈ C c ( R d × [0 , , (cid:90) B R (cid:90) ∂ t ζρ + ∇ ζ · j = (cid:90) B R ζ dµ − (cid:90) B R ζ + (cid:90) R d (cid:90) ζdf R . (3.5) As a consequence, ( ρ, j ) is a local minimizer of (2.13) in the sense that for every (cid:101) ρ, (cid:101) j ) with Spt ( (cid:101) ρ, (cid:101) j ) ⊂ B R × [0 , and satisfying (3.5) , (cid:90) B R (cid:90) ρ | j | ≤ (cid:90) B R (cid:90) (cid:101) ρ | (cid:101) j | . (3.6) Proof.

Once (3.5) is established, local minimality of ( ρ, j ) follows from the fact that( ˆ ρ, ˆ j ) deﬁned as ( (cid:101) ρ, (cid:101) j ) in B R × [0 ,

1] and ( ρ, j ) outside, is admissible for (2.13). Wethus only need to prove that (3.5) holds. By (3.4), for every ζ ∈ C c ( R d × [0 , (cid:90) R d (cid:90) ζdf R = (cid:90) Ω χ ≤ t R −

The measure f R is absolutely continuous with respect to the measure H d − ∂B R ⊗ dt and for t ∈ (0 , , sup ∂B R | f Rt | (cid:46) E d +2 − t ) d . (3.7) Proof.

Let $\zeta \in C^1_c(\mathbb{R}^d \times (0,1))$. For $0 < r \ll R$, let $\eta_r$ be a smooth radial function such that $\eta_r(x) = 0$ if $|x| \le R - r$, $\eta_r(x) = 1$ for $|x| \ge R$ and $\sup|\nabla\eta_r| \lesssim r^{-1}$. Testing (3.5) with $\zeta\eta_r$, we obtain, since $f^R$ is supported on $\partial B_R \times [0,1]$,
$$\left|\int_{\mathbb{R}^d}\int_0^1 \zeta\,df^R\right| \lesssim \int_{B_R\setminus B_{R-r}}\int_0^1 |\partial_t\zeta|\rho + |\nabla\zeta||j| + r^{-1}|\zeta||j| \overset{(2.20)}{\lesssim} \int_{B_R\setminus B_{R-r}}\int_0^1 \left(|\partial_t\zeta| + E^{\frac{1}{d+2}}|\nabla\zeta|\right)\frac{1}{(1-t)^d} + r^{-1}E^{\frac{1}{d+2}}\frac{|\zeta|}{(1-t)^d}.$$
Letting $r \to 0$, we obtain
$$\left|\int_{\mathbb{R}^d}\int_0^1 \zeta\,df^R\right| \lesssim \int_{\partial B_R}\int_0^1 E^{\frac{1}{d+2}}\frac{|\zeta|}{(1-t)^d},$$
from which the claim follows.

We define the outgoing and incoming fluxes as (see Figure 1) $\int_{\mathbb{R}^d}\int_0^1 \zeta\,df^R_+ := \int_\Omega \chi_{\le t^R_-}$

1, we deﬁne the cumulatedﬂuxes (cid:90) R d ζdf R + := (cid:90) Ω χ ≤ t R −

There holds $f^R_+ \perp f^R_-$ and
$$\sup_{\partial B_R} f^R_\pm \lesssim E^{\frac{1}{d+2}}\tau^{-(d-1)}. \tag{3.11}$$
Proof.

Let us prove that $f^R_+ \perp f^R_-$. Estimate (3.11) will then follow from (3.7). To this end we consider the space-time points on the cylinder through which particles exit,
$$A := \{(y,t) \in \partial B_R \times (0,1) : \exists x \in \Omega \text{ such that } t^R_- < t^R_+ \text{ and } (X(t^R_+), t^R_+) = (y,t)\},$$
and the original positions of particles that enter at the same space-time point where another particle exits,
$$B := \{x \in \Omega : 0 < t^R_- < t^R_+,\ \exists \tilde x \in \Omega \text{ with } \tilde t^R_- < \tilde t^R_+ = t^R_- \text{ and } X(x, t^R_-) = X(\tilde x, \tilde t^R_+)\}.$$
We claim that $B = \emptyset$. Recall (cf. Remark 2.5) that $T$ is given as a measurable selection of the subgradient of a convex function. In particular, we have $(T(x) - T(\tilde x))\cdot(x - \tilde x) \ge 0$ for all $x, \tilde x$. Hence, if $x$ is such that $0 < t^R_- < t^R_+$ and $\tilde x$ is such that $\tilde t^R_- < \tilde t^R_+ = t^R_-$, we cannot have $X(x, t^R_-) = T_{t^R_-}(x) = T_{t^R_-}(\tilde x) = X(\tilde x, \tilde t^R_+)$ and thus $B = \emptyset$. Now since $f^R_+(A^c) \overset{(3.8)}{=} \int_\Omega \chi_{\le t^R_-}$

Assume that E + D (cid:28) , then there exists R ∈ ( , ) such that (cid:90) ∂B R (cid:90) − τ ( f R ) (cid:46) τ − d E (3.14) and there exist densities ρ R ± and ρ R, lay + on ∂B R such that (cid:90) ∂B R ( ρ R ± ) (cid:46) E and W ∂B ( ρ R ± , f R ± ) (cid:46) E d +3 d +2 , (3.15) and (cid:90) ∂B R ( ρ R, lay + ) (cid:46) τ E + D and W ∂B ( ρ R, lay + , f R, lay + ) (cid:46) τ E d +3 d +2 + τ E d +2 D. (3.16) Moreover, sup ∂B R ρ R ± (cid:46) E d +2 . (3.17) Proof.

Let us start by (3.14). For this, given ζ ∈ C c ( R d × (0 , − τ )), integrating(3.5) in R ∈ ( , ), we obtain (cid:90) (cid:90) R d (cid:90) − τ ζdf R = (cid:90) (cid:90) B R (cid:90) − τ ∂ t ζρ + ∇ ζ · j. B R ρ R + Π X ( t R + ) t = 0 t = 1 − τt = 1 Figure 2: The deﬁnition of ρ R + . Letting ω ( x ) := (cid:82) χ B R ( x ) dR and using Fubini, we obtain (cid:90) (cid:90) R d (cid:90) − τ ζdf R = (cid:90) R d (cid:90) − τ ω ( ∂ t ζρ + ∇ ζ · j )= (cid:90) R d (cid:90) − τ ζ ∇ ω · j, where in the second line we used the fact that ( ρ, j ) satisﬁes the continuity equationon R d × (0 , ρ given by (2.20) and by (2.21), we thus obtain (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:90) (cid:90) R d (cid:90) − τ ζdf R (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:46) (cid:90) B (cid:90) − τ ρζ  (cid:90) B (cid:90) − τ ρ | j |  (cid:46) τ − d E (cid:90) B (cid:90) − τ ζ  , from which we obtain by duality (cid:90) (cid:90) ∂B R (cid:90) − τ ( f R ) (cid:46) τ − d E. (3.18)We now turn to (3.15) and (3.17). Notice that by (2.27), it is enough to prove (3.15)for W instead of W ∂B . Let ρ R ± be the measures supported on ∂B R such that for ζ ∈ C c ( R d ) (see Figure 2) ∂B R ζdρ R + := (cid:90) Ω χ t R −

1, we have X (1) ∈ B ⊂ O and therefore, (cid:90) R d × R d ζ ( x ) d (cid:98) Π = (cid:90) Ω × O × O χ t R − < − τ

Assume that E + D (cid:28) , then there exists R ∈ ( , ) such that theconclusions of Lemma 3.4 hold and W (cid:18) ρ − τ B R , ρ − τ ( B R ) | B R | χ B R (cid:19) + W (cid:18) µ (cid:48) R , µ (cid:48) R ( B R ) | B R | χ B R (cid:19) (cid:46) τ E + D. (3.27) Proof.

We are only going to show that (cid:90) W (cid:18) ρ − τ B R , ρ − τ ( B R ) | B R | χ B R (cid:19) (cid:46) τ E + D, (3.28)since the estimate for W (cid:16) µ (cid:48) R , µ (cid:48) R ( B R ) | B R | χ B R (cid:17) is similarly obtained. For notationalsimplicity, in this proof we will often drop the R dependence in our notation. PutΛ := ρ − τ ( B R ) | B R | and Γ := µ ( O ) | O | . We will not distinguish between Λ and the functionΛ χ B R and similarly for Γ. Since E (cid:28)

1, (2.18) and (2.19) imply that Λ ∼ Γ ∼ X be the optimal trajectories for W ( χ Ω , µ ) and Π be the optimal coupling for D = W ( µ O, Γ χ O ). Let us recall that Π is the measure deﬁned on Ω × O by (cid:90) Ω × O ζd Π := (cid:90) Ω χ O ( T ( x )) ζ ( x, T ( x ))and let (cid:101) µ ≤ µ be the measure deﬁned by (see Figure 4) (cid:90) R d ζd (cid:101) µ := (cid:90) Ω χ | X (1 − τ ) |≤ R ( X ) ζ ( X (1)) = (cid:90) Ω × O χ | X (1 − τ ) |≤ R ( X ) ζ ( y ) d Π , or in words, (cid:101) µ is the part of µ which originates from ρ − τ B R along X . Notice that (cid:101) µ B R = µ (cid:48) R (recall (3.26)). Recall that if Π is the coupling obtained by the GluingLemma applied to Π and Π , we can then deﬁne g ≤ Γ χ O by (cid:90) R d ζdg := (cid:90) Ω × O × O χ | X (1 − τ ) |≤ R ( X ) ζ ( z ) d Π( x, y, z ) , which is the part of Γ χ O which originates from (cid:101) µ through Π . We then project theparts of (cid:101) µ and g outside B R onto ∂B R : (cid:90) R d ζdf (cid:101) µ := (cid:90) R d χ B cR ( y ) ζ (cid:18) R y | y | (cid:19) d (cid:101) µ = (cid:90) Ω × O χ | X (1 − τ ) |≤ R ( X ) χ B cR ( y ) ζ (cid:18) R y | y | (cid:19) d Π B R O Γ ρ − τ t = 0 t = 1 − τt = 1 g f g f (cid:101) µ (cid:101) µ Figure 4: The deﬁnition of (cid:101) µ , g , f (cid:101) µ and f g .41 nd (cid:90) R d ζdf g := (cid:90) O χ B cR ( z ) ζ (cid:18) R z | z | (cid:19) dg = (cid:90) Ω × O × O χ | X (1 − τ ) |≤ R ( X ) χ B cR ( z ) ζ (cid:18) R z | z | (cid:19) d Π . Since g ≤ Γ (cid:46) x (cid:90) (cid:90) ∂B R f g (cid:46) (cid:90) (cid:90) Ω × O × O | R − | z || χ | X (1 − τ ) |≤ R ( X ) χ B cR ( z ) d Π (3.22) ≤ (cid:90) Ω × O × O | X (1 − τ ) − z | χ | X (0) | < ( X ) d Π (cid:46) (cid:90) Ω × O × O | X (1 − τ ) − y | χ | X (0) | < ( X ) d Π + (cid:90) Ω × O × O | z − y | d Π ≤ τ E + D. (3.29)We then let ˆ µ := (cid:101) µ B R + f (cid:101) µ and ˆ g := g B R + f g . 
Since projecting from outside B R reduces the distances W ( ρ − τ B R , ˆ µ ) ≤ W ( ρ − τ B R , (cid:101) µ ) ≤ (cid:90) Ω χ | X (0) | < | X (1 − τ ) − X (1) | = τ (cid:90) B | T − x | = τ E. For the same reason, we also have W (ˆ µ, ˆ g ) ≤ W ( (cid:101) µ, g ) ≤ D. Therefore by triangle inequality W ( ρ − τ B R , Λ) (cid:46) W ( ρ − τ B R , ˆ µ )+ W (ˆ µ, ˆ g )+ W (ˆ g, Λ) (cid:46) τ E + D + W (ˆ g, Λ) . (3.30)We are thus left with the estimate of W (ˆ g, Λ). For this we ﬁrst claim that W (ˆ g, Λ) (cid:46) W (cid:18)

12 (ˆ g + Λ) , Λ (cid:19) . Indeed, by triangle inequality and monotonicity of the transport cost W (Λ , s ) ≤ W (cid:18) Λ ,

12 (Λ + s ) (cid:19) + W (cid:18)

12 (Λ + s ) , s (cid:19) ≤ W (cid:18) Λ ,

12 (Λ + s ) (cid:19) + W (cid:18)

12 Λ , s (cid:19) = W (cid:18) Λ ,

12 (Λ + s ) (cid:19) + 1 √ W (Λ , s ) . x notice that we cannot assert the same thing for f (cid:101) µ ow let ϕ g be the solution of  ∆ ϕ g = Λ − g in B R∂ϕ g ∂ν = f g on ∂B R , (3.31)with (cid:82) B R ϕ g = 0. Notice that since by deﬁnition of Λ and g , Λ = | B R | g ( R d ) so thatby deﬁnition of f g , (cid:90) B R (Λ − g ) = g ( B cR ) = (cid:90) ∂B R f g , so that this equation is indeed solvable. Let (cid:101) ρ := (1 − t )Λ + t

12 (Λ + ˆ g ) and (cid:101) j := 12 ∇ ϕ g . The pair ( (cid:101) ρ, (cid:101) j ) is admissible for the Benamou-Brenier formulation (2.13) of W (Λ , (Λ+ˆ g )) since (3.31) implies in a distributional sense ∇ · (cid:101) j = 12 (Λ − g − f g ) = 12 (Λ − ˆ g ) in R d , where we think of (cid:101) j as being extended by zero from B R to R d . Hence, as desired, ∂ t (cid:101) ρ + ∇ · (cid:101) j = 0 in R d × (0 , (cid:101) ρ ≥

12 Λ , we thus have W (Λ ,

12 (Λ + ˆ g )) (cid:46) (cid:90) B R |∇ ϕ g | . (3.32)Let g − := (Γ − g ) χ B R so that by deﬁnition of g , (cid:90) B R ζdg − = (cid:90) Ω × O × O χ | X (1 − τ ) | >R ( x ) χ B R ( z ) ζ ( z ) d Π . Thanks to the L ∞ bound (2.18) on the transport, we have that Spt g − ⊂ B R \ B R/ .We can rewrite ∆ ϕ g = Λ − g = g − − (Γ − Λ) so that by Lemma 2.2, (cid:90) B R |∇ ϕ g | (cid:46) (cid:90) ∂B R f g + (cid:90) B R ( R − | x | ) dg − . Arguing as for (3.21), we get (cid:90) (cid:90) B R ( R − | x | ) dg − (cid:46) D, so that using (3.29), (3.30) and (3.32) we obtain (3.28). From this we see that we mayﬁnd R ∈ ( , ) such that both the conclusions of Lemma 3.4 and (3.27) hold. .2 The main estimate To ease notation, we shall now assume that R = 1 and we will drop the index R .The main goal of this section is to prove Proposition 1.6 which states that for everyﬁxed τ (cid:28)

1, there exists a constant C ( τ ) > E and D are small enough,then there exists an harmonic gradient ﬁeld ∇ ϕ in B such that (cid:90) B (cid:90) ρ | j − ρ ∇ ϕ | (cid:46) τ E + C ( τ ) D. (3.33)From this Eulerian estimate, Proposition 1.5 which is the Lagrangian counterpart, isreadily obtained. This in turn leads to the proof of Theorem 1.4, which is one stepin a Campanato iteration scheme.We now proceed with the deﬁnition of ϕ . Recall ρ ± from Lemma 3.4 and let ϕ bethe (unique) solution of  ∆ ϕ = | B | (cid:82) ∂B ( ρ + − ρ − ) in B ∂ϕ∂ν = ρ + − ρ − on ∂B , (3.34)such that (cid:82) B ϕ = 0. Notice that by (2.1), the H¨older inequality and (3.15) (cid:90) B |∇ ϕ | (cid:46) (cid:90) ∂B ρ + ρ − (cid:46) E. (3.35)Moreover, by Pohozaev, we also have (cid:90) ∂B |∇ ϕ | (cid:46) E. (3.36)The proof of (3.33) is divided into two parts. The ﬁrst is an almost orthogonalityproperty (see (3.37)) and the second is a construction of a competitor to estimate (cid:90) B (cid:90) ρ | j | − (cid:90) B |∇ ϕ | , see (3.52). We start with the almost orthogonality property. Proposition 3.6.

For every < τ (cid:28) , there exist constants ε ( τ ) > and C ( τ ) > such that if E + D ≤ ε ( τ ) , then letting ϕ be deﬁned via (3.34) , we have (cid:90) B (cid:90) ρ | j − ρ ∇ ϕ | (cid:46) (cid:18)(cid:90) B (cid:90) ρ | j | − (cid:90) B |∇ ϕ | (cid:19) + τ E + C ( τ ) D. (3.37) Proof. S tep 1. Before starting, let us point out that since in B the function ∇ ϕ issmooth, the measure ρ ∇ ϕ is well deﬁned. Furthermore, since clearly j − ρ ∇ ϕ (cid:28) ρ also the left-hand side of (3.37) is well deﬁned. We start by noting that (cid:90) B (cid:90) ρ | j − ρ ∇ ϕ | = (cid:90) B (cid:90) − τ ρ | j − ρ ∇ ϕ | + (cid:90) B (cid:90) − τ ρ | j − ρ ∇ ϕ | nd that since by harmonicity of ∇ ϕ , sup B |∇ ϕ | (cid:46) (cid:82) B |∇ ϕ | , (cid:90) B (cid:90) − τ ρ | j − ρ ∇ ϕ | (cid:46) (cid:90) B (cid:90) − τ ρ | j | + sup B |∇ ϕ | (cid:90) B (cid:90) − τ ρ (2.21)&(3.35) (cid:46) τ E. Therefore, (cid:90) B (cid:90) ρ | j − ρ ∇ ϕ | (cid:46) (cid:90) B (cid:90) − τ ρ | j − ρ ∇ ϕ | + τ E (3.38)and we are left with bounding the ﬁrst term on the right-hand side. Notice that sincein (0 , − τ ), ρ and j are bounded functions by (2.20), the right-hand side of (3.38)is well deﬁned in a pointwise sense. Recalling that ρ = (cid:82) − τ ρ , we may now compute (cid:90) B (cid:90) − τ ρ | j − ρ ∇ ϕ | = (cid:90) B (cid:90) − τ ρ | j | − (cid:90) B |∇ ϕ | (3.39) − (cid:90) B (cid:90) − τ (cid:18) j − − τ ∇ ϕ (cid:19) · ∇ ϕ + (cid:90) B ( ρ − |∇ ϕ | . Step 2.

In this step we show that (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) B ( ρ − |∇ ϕ | (cid:12)(cid:12)(cid:12)(cid:12) (cid:46) (cid:16) τ − d ( d − γ d ( τ ) (cid:17) d +2 E d +3 d +2 + τ E, (3.40)where γ d ( τ ) :=  d = 2 | log τ | for d = 3 τ − ( d − otherwise . (3.41)Let the boundary layer size r (cid:28) η be a smooth cut-oﬀfunction with χ B − r ≤ η ≤ χ B − r and |∇ η | (cid:46) r − . We split the integral: (cid:90) B ( ρ − |∇ ϕ | = (cid:90) B ( ρ − − η ) |∇ ϕ | + (cid:90) B ( ρ − η |∇ ϕ | . (3.42)The ﬁrst term may be estimated as follows (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) B ( ρ − − η ) |∇ ϕ | (cid:12)(cid:12)(cid:12)(cid:12) ≤ (cid:90) B \ B − r | ρ − ||∇ ϕ | (cid:46) τ − ( d − (cid:90) B \ B − r |∇ ϕ | (cid:46) rτ − ( d − (cid:90) ∂B ρ + ρ − (3.15) (cid:46) rτ − ( d − E. (3.43) e now turn to the second term. By (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) B ( ρ − η |∇ ϕ | (cid:12)(cid:12)(cid:12)(cid:12) ≤ (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) B ( ρ − (1 − τ )) η |∇ ϕ | (cid:12)(cid:12)(cid:12)(cid:12) + τ (cid:90) B |∇ ϕ | (cid:46) (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) B ( ρ − (1 − τ )) η |∇ ϕ | (cid:12)(cid:12)(cid:12)(cid:12) + τ E, it is enough to estimate (cid:12)(cid:12)(cid:12)(cid:82) B ( ρ − (1 − τ )) η |∇ ϕ | (cid:12)(cid:12)(cid:12) . To this purpose we give an alter-native representation: since − (1 − τ − t ) η |∇ ϕ | ∈ C ∞ c ( B × [0 , t ∈ [1 − τ,

1] and test (3.5) with it to obtain (cid:90) B ¯ ρη |∇ ϕ | = (cid:90) B (cid:90) − τ ρ∂ t ( − (1 − τ − t ) η |∇ ϕ | )= (cid:90) B (cid:90) − τ (1 − τ − t ) j · ∇ ( η |∇ ϕ | ) + (cid:90) B (1 − τ ) η |∇ ϕ | . Therefore, (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) B ( ρ − (1 − τ )) η |∇ ϕ | (cid:12)(cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) B (cid:90) − τ (1 − τ − t ) j · ∇ ( η |∇ ϕ | ) (cid:12)(cid:12)(cid:12)(cid:12) ≤ (cid:18)(cid:90) B (cid:90) − τ ρ | j | (cid:19) (cid:18)(cid:90) B (cid:90) − τ (1 − τ − t ) ρ |∇ ( η |∇ ϕ | ) | (cid:19) (2.20) (cid:46) E (cid:18)(cid:90) − τ − t ) d − (cid:90) B |∇ ( η |∇ ϕ | ) | (cid:19) (3.44) (cid:46) γ d ( τ ) E (cid:18)(cid:90) B |∇ ( η |∇ ϕ | ) | (cid:19) , where we recall that γ d is deﬁned in (3.41). By Leibniz rule and Cauchy-Schwarz wehave (cid:90) B |∇ ( η |∇ ϕ | ) | (cid:46) r (cid:90) B − r |∇ ϕ | + (cid:90) B − r |∇ ϕ | |∇ ϕ | (cid:46) r (cid:90) B − r |∇ ϕ | + r (cid:90) B − r |∇ ϕ | . By the mean value formula for ∇ ϕ , for every x ∈ B − r , |∇ ϕ | ( x ) (cid:46) r | B r | (cid:90) B r ( x ) |∇ ϕ | so that integrating, using Jensen inequality and Fubini, r (cid:90) B − r |∇ ϕ | (cid:46) r (cid:90) B − r |∇ ϕ | rom which the above estimate simpliﬁes to (cid:90) B |∇ ( η |∇ ϕ | ) | (cid:46) r (cid:90) B − r |∇ ϕ | . Let p = dd − . By the mean value formula for ∇ ϕ and Jensen’s inequality,sup B − r |∇ ϕ | (cid:46) (cid:18) r d (cid:90) B |∇ ϕ | p (cid:19) p (2.1) (cid:46) r − dp (cid:18)(cid:90) ∂B ρ + ρ − (cid:19) (cid:46) r − dp E . We then have 1 r (cid:90) B − r |∇ ϕ | ≤ r sup B − r |∇ ϕ | − p (cid:90) B |∇ ϕ | p (cid:46) r − (cid:16) r − dp E (cid:17) − p E p = r − d E . 
Collecting all the previous estimates we obtain
$$\left|\int_{B_1} (\bar\rho - 1)\,\eta\,|\nabla\varphi|^2\right| \lesssim r^{-\frac{d}{2}}\gamma_d(\tau)^{\frac12}E^{\frac32} + \tau E$$
and thus, plugging this and (3.43) into (3.42), we get
$$\left|\int_{B_1} (\bar\rho - 1)|\nabla\varphi|^2\right| \lesssim r\tau^{-(d-1)}E + r^{-\frac{d}{2}}\gamma_d(\tau)^{\frac12}E^{\frac32} + \tau E.$$
Optimizing in $r$ through $r = \left(\tau^{2(d-1)}\gamma_d(\tau)E\right)^{\frac{1}{d+2}}$ and using $\gamma_d(\tau)E \ll \tau^{-2(d-1)}$ to ensure that $r \ll 1$, we obtain the desired estimate (3.40).
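
The choice of $r$ at the end of Step 2 can be checked by a short computation. The sketch below assumes that the two $r$-dependent contributions are $r\tau^{-(d-1)}E$, coming from (3.43), and $r^{-d/2}\gamma_d(\tau)^{1/2}E^{3/2}$, coming from (3.44) together with the bound on $\nabla(\eta|\nabla\varphi|^2)$. These exponents are read off from the surrounding estimates, so they should be treated as a reconstruction rather than a quotation.

```latex
\begin{align*}
r\,\tau^{-(d-1)}E = r^{-\frac{d}{2}}\gamma_d(\tau)^{\frac12}E^{\frac32}
  &\iff r^{\frac{d+2}{2}} = \tau^{d-1}\,\gamma_d(\tau)^{\frac12}E^{\frac12} \\
  &\iff r = \bigl(\tau^{2(d-1)}\gamma_d(\tau)E\bigr)^{\frac{1}{d+2}}, \\[4pt]
r\,\tau^{-(d-1)}E
  &= \tau^{\frac{2(d-1)}{d+2}-(d-1)}\,\gamma_d(\tau)^{\frac{1}{d+2}}\,E^{1+\frac{1}{d+2}} \\
  &= \bigl(\tau^{-d(d-1)}\gamma_d(\tau)\bigr)^{\frac{1}{d+2}}E^{\frac{d+3}{d+2}}.
\end{align*}
```

With this choice both $r$-dependent terms have the same size, which is the first term on the right-hand side of (3.40). The requirement $r \ll 1$ is exactly the condition $\gamma_d(\tau)E \ll \tau^{-2(d-1)}$.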

Step 3.

We now estimate (cid:90) B (cid:90) − τ (cid:18) j − − τ ∇ ϕ (cid:19) · ∇ ϕ. For this we want to use (3.5) for ζ = χ (0 , − τ ) ϕ . Notice ﬁrst that since ρ , j and f ± (recall the deﬁnition (3.9)) are bounded densities in (0 , − τ ) (see Lemma 2.4and Lemma 3.3), by density we can apply (3.5) to ζ ∈ H ( B × (0 , ζ ⊂ B × [0 , − τ / ϕ δ ∈ C ( B ) be a molliﬁcation of ϕ so that by continuity of t → ρ t in W , 1 ε (cid:90) − τ + ε − τ (cid:90) B ϕ δ ρ t → (cid:90) B ϕ δ ρ − τ . (3.45)Then, apply (3.5) to η ε ( t ) ϕ δ ( x ) where for ε > η ε ( t ) =  t ∈ (0 , − τ ]1 − ε − ( t − (1 − τ )) for t ∈ (1 − τ, − τ + ε )0 for t ≥ − τ + ε o obtain for ε → (cid:90) B (cid:90) − τ ∇ ϕ δ · j = (cid:90) B ϕ δ ρ − τ − (cid:90) B ϕ δ + (cid:90) R d (cid:90) − τ ϕ δ df. Letting δ → ϕ = constant and (cid:82) B ϕ = 0 and recalling the deﬁnitionof ϕ in (3.34) and (3.10), we thus obtain (cid:90) B (cid:90) − τ (cid:18) j − − τ ∇ ϕ (cid:19) ·∇ ϕ = (cid:90) B ϕρ − τ + (cid:90) ∂B ϕ [( f + − ρ + ) − ( f − − ρ − )] . (3.46)Let us estimate the ﬁrst term. Let ( (cid:101) ρ, (cid:101) j ) be given by the Benamou-Brenier theoremand such that (cid:90) B (cid:90) (cid:101) ρ | (cid:101) j | = W (cid:18) ρ − τ ( B ) | B | χ B , ρ − τ B (cid:19) (3.27) (cid:46) τ E + D. (3.47)If (cid:101) T is the optimal transport map between ρ − τ ( B ) | B | χ B and ρ − τ B , (cid:101) ρ − d t = (cid:18) ρ − τ ( B ) | B | (cid:19) − d det d ∇ (cid:101) T t ( (cid:101) T − t ) (2.22) ≥ (cid:18) ρ − τ ( B ) | B | (cid:19) − d (cid:16) (1 − t ) + t det d ∇ (cid:101) T ( (cid:101) T − t ) (cid:17) = (1 − t ) (cid:18) ρ − τ ( B ) | B | (cid:19) − d + tρ − d − τ ( (cid:101) T − t ) (2.20) ≥ (1 − t ) (cid:18) ρ − τ ( B ) | B | (cid:19) − d + tτ and thus since ρ − τ ( B ) | B | ∼ (cid:101) ρ (cid:46) ((1 − t ) + tτ ) − d . 
(3.48)We then have because of (cid:82) B ϕ = 0, (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) B ϕρ − τ (cid:12)(cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) B (cid:90) ∇ ϕ · (cid:101) j (cid:12)(cid:12)(cid:12)(cid:12) ≤ (cid:18)(cid:90) B (cid:90) (cid:101) ρ |∇ ϕ | (cid:19) (cid:18)(cid:90) B (cid:90) (cid:101) ρ | (cid:101) j | (cid:19) (3.47) (cid:46) (cid:18)(cid:90) B (cid:90) (cid:18)(cid:101) ρ − ρ − τ ( B ) | B | (cid:19) |∇ ϕ | + ρ − τ ( B ) | B | (cid:90) B |∇ ϕ | (cid:19) (cid:0) τ E + D (cid:1) (3.35) (cid:46) (cid:18)(cid:90) B (cid:90) (cid:18)(cid:101) ρ − ρ − τ ( B ) | B | (cid:19) |∇ ϕ | + E (cid:19) (cid:0) τ E + D (cid:1) (cid:46) τ (cid:90) B (cid:90) (cid:18)(cid:101) ρ − ρ − τ ( B ) | B | (cid:19) |∇ ϕ | + τ E + τ − D, (3.49) here in the last line we used Young’s inequality. The term (cid:82) B (cid:82) (cid:16)(cid:101) ρ − ρ − τ ( B ) | B | (cid:17) |∇ ϕ | is estimated as in Step 2. Indeed, choosing for r (cid:28) η with χ B − r ≤ η ≤ χ B − r , we obtain as in (3.43) that (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) B (cid:90) (cid:18)(cid:101) ρ − ρ − τ ( B ) | B | (cid:19) (1 − η ) |∇ ϕ | (cid:12)(cid:12)(cid:12)(cid:12) (cid:46) rτ − ( d − E. Using that (cid:101) ρ − ρ − τ ( B ) | B | = (cid:90) (1 − t ) ∂ t (cid:101) ρ, we obtain as in (3.44) (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) B (cid:90) (cid:18)(cid:101) ρ − ρ − τ ( B ) | B | (cid:19) η |∇ ϕ | (cid:12)(cid:12)(cid:12)(cid:12) (cid:46) (cid:18)(cid:90) B (cid:90) (cid:101) ρ | (cid:101) j | (cid:19) (cid:18)(cid:90) B (cid:90) (1 − t ) ((1 − t ) + tτ ) d |∇ ( η |∇ ϕ | ) | (cid:19) (3.47) (cid:46) γ d ( τ )( τ E + D ) (cid:18)(cid:90) B |∇ ( η |∇ ϕ | ) | (cid:19) (cid:46) r − d γ d ( τ )( τ E + D ) E. Optimizing in r , we get (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) B (cid:90) (cid:18)(cid:101) ρ − ρ − τ ( B ) | B | (cid:19) |∇ ϕ | (cid:12)(cid:12)(cid:12)(cid:12) (cid:46) (cid:16) τ − d ( d − γ d ( τ ) (cid:17) d +2 ( τ E + D ) d +2 E. 
Plugging this into (3.49) we obtain for some C ( τ ) (cid:29) (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) B ϕρ − τ (cid:12)(cid:12)(cid:12)(cid:12) (cid:46) (cid:16) C ( τ )( τ E + D ) d +2 + τ (cid:17) E + τ − D. (3.50)We now turn to the second term in (3.46). It is enough to bound (cid:90) ∂B ϕ ( f + − ρ + )since the other term is treated analogously. Let ( ˆ ρ, ˆ j ) be the minimizer of (2.29), i.e. (cid:90) ∂B (cid:90) ρ | ˆ j | = W ∂B ( ρ + , f + ) (3.15) (cid:46) E d +3 d +2 . Arguing as for (3.48) but using (2.30) and (2.31) together with (3.17) and (3.11), weobtain ˆ ρ − d − (cid:38) ((1 − t ) + tτ ) E − d − d +2) so that (cid:90) ˆ ρ (cid:46) τ − ( d − E d +2 , ith the convention that for d = 2, τ − ( d − = | log τ | . By integration by parts wethen have (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) ∂B ϕ ( f + − ρ + ) (cid:12)(cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) ∂B (cid:90) ∇ bdr ϕ · ˆ j (cid:12)(cid:12)(cid:12)(cid:12) ≤ (cid:18)(cid:90) ∂B (cid:90) ˆ ρ |∇ ϕ | (cid:19) (cid:18)(cid:90) ∂B (cid:90) ρ | ˆ j | (cid:19) (3.36) (cid:46) (cid:16) τ − ( d − E d +2 (cid:17) E (cid:16) E d +3 d +2 (cid:17) (cid:46) τ − d − E d +3 d +2 , with the convention that for d = 2, τ − d − = | log τ | . This estimate together with(3.50) yields (cid:90) B (cid:90) − τ (cid:18) j − − τ ∇ ϕ (cid:19) · ∇ ϕ (cid:46) (cid:16) C ( τ ) (cid:104) ( τ E + D ) d +2 + E d +2 (cid:105) + τ (cid:17) E + τ − D. (3.51)Putting together (3.39), (3.40) and (3.51), we conclude that (cid:90) B (cid:90) ρ | j − ρ ∇ ϕ | − (cid:18)(cid:90) B (cid:90) ρ | j | − (cid:90) B |∇ ϕ | (cid:19) (cid:46) (cid:16) C ( τ ) (cid:104) ( τ E + D ) d +2 + E d +2 (cid:105) + τ (cid:17) E + τ − D, so that (3.37) follows if E + D ≤ ε ( τ ) for some ε ( τ ) small enough.We may now use the minimality of ( ρ, j ) to estimate (cid:82) B (cid:82)

10 1 ρ | j | − (cid:82) B |∇ ϕ | . Proposition 3.7.

For every < τ (cid:28) , there exist ε ( τ ) and C ( τ ) such that if E + D ≤ ε ( τ ) , then letting ϕ be deﬁned via (3.34) , we have (cid:90) B (cid:90) ρ | j | − (cid:90) B |∇ ϕ | (cid:46) τ E + C ( τ ) D. (3.52) Proof.

Recall the measure f from (3.4). We are going to construct a competitor( (cid:101) ρ, (cid:101) j ) supported in B × [0 ,

We start by constructing and estimating ( ρ bulk , j bulk ). The main estimate ofthis step is (cid:90) B (cid:90) − τ (cid:12)(cid:12)(cid:12) | j bulk | − ρ in |∇ ϕ | (cid:12)(cid:12)(cid:12) (cid:46) τ E + (cid:16) τ − d E (cid:17) d +2 d +1 + D. (3.57)Note that the ﬁrst right-hand side term of (3.56) involves ρ lay through ρ in = ρ bulk + ρ lay . However, for its estimate in this substep we only need that for t ∈ (0 , − τ ), ρ lay (cid:46) (cid:0) τ E + D (cid:1) (cid:28) . (3.58)Recalling the deﬁnition (3.12) of f lay+ and that f ± are deﬁned in (3.9) similarly, welet m lay+ := (cid:90) ∂B f lay+ and ˆ f :=  t ∈ [0 , τ ) − − τ f − for t ∈ [ τ, ) − τ f + for t ∈ [ , − τ ) . (3.59)Notice that since (cid:82) ∂B f lay+ = (cid:82) ∂B ρ lay+ , by Cauchy-Schwarz and (3.16), m lay+ (cid:46) (cid:0) τ E + D (cid:1) . (3.60)Moreover, since f + ⊥ f − (recall Lemma 3.3), (cid:90) ∂B (cid:90) − τ ˆ f (cid:46) (cid:90) ∂B f + f − (3.10) = (cid:90) ∂B f (cid:46) (cid:90) ∂B (cid:90) − τ f , where in the last inequality we used Jensen’s inequality. This yields (cid:90) ∂B (cid:90) − τ ( f − ˆ f ) (cid:46) (cid:90) ∂B (cid:90) − τ f (cid:46) τ − d E. (3.61) − m lay+ | B | f − τρ + − ρ − − τ − m lay+ | B | − | B | (cid:82) ∂B ρ + − ρ − − f + − τ ρ bulk ∂B ∂B Figure 5: The deﬁnition of ρ bulk . For a boundary layer size r (cid:29) (cid:0) τ − d E (cid:1) d +1 (cid:38) (cid:16)(cid:82) ∂B (cid:82) − τ ( f − ˆ f ) (cid:17) d +1 let A r := B \ B − r , and let ( s, q ) be given by [23, Lemma 3.4] applied to f − ˆ f and to the timeinterval (0 , − τ ) instead of (0 , s, q ) is such that it has supportin A r × [0 , − τ ], | s | ≤ and for ζ ∈ C ( R d × [0 , (cid:90) B (cid:90) ∂ t ζs + ∇ ζ · q = (cid:90) A r (cid:90) − τ ∂ t ζs + ∇ ζ · q = (cid:90) ∂B (cid:90) − τ ζ ( f − ˆ f ) . In addition it satisﬁes the estimate (cid:90) A r (cid:90) − τ | q | (cid:46) r (cid:90) ∂B (cid:90) − τ ( f − ˆ f ) (cid:46) rτ − d E. 
(3.62)We then let ( ρ bulk , j bulk ), supported in B × [0 , − τ ], be deﬁned through (see Figure5) ρ bulk := 1 − m lay+ | B | + s − | B | (cid:90) ∂B ( ρ + − ρ − ) ×  t ∈ [0 , τ ] t − τ − τ for t ∈ [2 τ, − τ ]1 for t ∈ [1 − τ, − τ ]and j bulk := q + ∇ ϕ ×  t ∈ [0 , τ ] − τ for t ∈ [2 τ, − τ ]0 for t ∈ [1 − τ, − τ ] , o that by deﬁnition (3.34) of ϕ (cid:90) B (cid:90) ∂ t ζρ bulk + ∇ ζ · j bulk = (cid:90) B ζ − τ (cid:32) − m lay+ | B | − | B | (cid:90) ∂B ( ρ + − ρ − ) (cid:33) − ζ (cid:32) − m lay+ | B | (cid:33) + (cid:90) ∂B (cid:90) − τ ζ (cid:18) χ (2 τ, − τ ) − τ ( ρ + − ρ − ) + f − ˆ f (cid:19) . (3.63)Notice that thanks to (3.58) also (3.55) is satisﬁed.Now, we can start estimating (cid:90) B − r (cid:90) − τ (cid:12)(cid:12)(cid:12) | j bulk | − ρ in |∇ ϕ | (cid:12)(cid:12)(cid:12) . By deﬁnition of ρ bulk , since s vanishes in B − r × (0 , − τ ), | − ρ in | (cid:46) ρ lay + m lay+ + (cid:90) ∂B ( ρ + + ρ − ) (3.58)&(3.60) (cid:46) (cid:0) τ E + D (cid:1) + (cid:18)(cid:90) ∂B ρ + ρ − (cid:19) (3.15) (cid:46) D + E (cid:28) B − r × (0 , − τ ) . Therefore, since q also vanishes on B − r × (0 , − τ ), (cid:90) B − r (cid:90) − τ (cid:12)(cid:12)(cid:12) | j bulk | − ρ in |∇ ϕ | (cid:12)(cid:12)(cid:12) (cid:46) ( D + E + τ ) (cid:90) B |∇ ϕ | (cid:46) E + D E + τ E (cid:46) E + τ E + D, where in the last line we used Young’s inequality together with the fact that since E (cid:28) E (cid:46) E . Choosing r to be a large multiple of (cid:0) τ − d E (cid:1) d +1 , we have (cid:90) A r (cid:90) − τ (cid:12)(cid:12)(cid:12) | j bulk | − ρ in |∇ ϕ | (cid:12)(cid:12)(cid:12) (3.55) (cid:46) (cid:90) A r |∇ ϕ | + (cid:90) A r (cid:90) − τ | q | (cid:46) r (cid:16) E + τ − d E (cid:17) (3.14) (cid:46) (cid:16) τ − d E (cid:17) d +2 d +1 . Combining these two estimates and taking into account that since τ (cid:28) E (cid:28) E (cid:46) (cid:0) τ − d E (cid:1) d +2 d +1 , we ﬁnd (3.57). 
Notice also for further reference that using thesame argument, we obtain (cid:90) B (cid:90) − τ | j bulk | (cid:46) E + (cid:16) τ − d E (cid:17) d +2 d +1 . (3.64) f − τ ρ − − τ − τf + − τρ + τ∂B ρ bdr1 τ∂B ρ bdr2 ∂B ∂B Figure 6: The deﬁnition of ρ bdr , bulk . Step 2.

We now deﬁne ( ρ bdr , bulk , j bdr , bulk ), supported in ∂B × [0 , − τ ] so that (cid:90) ∂B (cid:90) ∂ t ζρ bdr , bulk + ∇ bdr ζ · j bdr , bulk = (cid:90) ∂B (cid:90) − τ ζ (cid:18) ˆ f − χ (2 τ, − τ ) − τ ( ρ + − ρ − ) (cid:19) (3.65)holds and (cid:90) ∂B (cid:90) ρ bdr , bulk | j bdr , bulk | (cid:46) | log τ | E d +3 d +2 . (3.66)Notice that combining (3.63) and (3.65) yields (cid:90) B (cid:90) ∂ t ζρ bulk + ∇ ζ · j bulk + (cid:90) ∂B (cid:90) ∂ t ζρ bdr , bulk + ∇ bdr ζ · j bdr , bulk = (cid:90) B ζ − τ (cid:32) − m lay+ | B | − | B | (cid:90) ∂B ( ρ + − ρ − ) (cid:33) − ζ (cid:32) − m lay+ | B | (cid:33) + (cid:90) ∂B (cid:90) − τ ζf. (3.67)We make the ansatz ( ρ bdr , bulk , j bdr , bulk ) := ( ρ bdr1 + ρ bdr2 , j bdr1 + j bdr2 ) (see Figure 6)requiring that (cid:90) ∂B (cid:90) ∂ t ζρ bdr1 + ∇ bdr ζ · j bdr1 = (cid:90) ∂B (cid:90) − ττ ζ (cid:18) χ ( , − τ ) − τ f + − χ (2 τ, − τ ) − τ ρ + (cid:19) (3.68)and (cid:90) ∂B (cid:90) ∂ t ζρ bdr2 + ∇ bdr ζ · j bdr2 = (cid:90) ∂B (cid:90) − ττ ζ (cid:18) χ (2 τ, − τ ) − τ ρ − − χ ( τ, ) − τ f − (cid:19) , (3.69)so that by deﬁnition (3.59) of ˆ f , (3.65) holds. Let ( ρ bdr1 , j bdr1 ) be given by Lemma2.6 for f = ρ + , f = f + , a = 2 τ, b = 1 − τ, c = 12 , and d = 1 − τ o that 1( d − b ) − ( c − a ) log d − bc − a = − − τ log 2 τ − τ (cid:46) | log τ | . Thanks to (2.32), we have (3.68) and by (2.33) combined with (3.15), we have (cid:90) ∂B (cid:90) ρ bdr1 | j bdr1 | (cid:46) | log τ | E d +3 d +2 . (3.70)Similarly, using Lemma 2.6 with f = f − , f = ρ − , a = τ, b = 12 , c = 2 τ, and d = 1 − τ to deﬁne ( ρ bdr2 , j bdr2 ), we obtain that (3.69) holds and that (3.70) is also satisﬁes by( ρ bdr2 , j bdr2 ). By subadditivity this proves (3.66) Step 3.

We now deﬁne and estimate the quantities related to the terminal layer(in time). In this step we deal with the construction in the time interval [0 , − τ ](see Figure 7) and deﬁne ( ρ bdr , lay , j bdr , lay ) supported ∂B × [0 , − τ ] and ( ρ lay , j lay )supported in B × [0 , − τ ] such that (recall the deﬁnition (3.12) of f lay+ ) (cid:90) B (cid:90) − τ ∂ t ζρ lay + ∇ ζ · j lay + (cid:90) ∂B (cid:90) − τ ∂ t ζρ bdr , lay + ∇ bdr ζ · j bdr , lay = (cid:90) ∂B ζ − τ f lay+ − (cid:90) B ζ m lay+ | B | , (3.71)and (cid:90) B (cid:90) − τ | j lay | + (cid:90) ∂B (cid:90) − τ ρ bdr , lay | j bdr , lay | (cid:46) τ E + D. (3.72)Let ϕ lay be the solution of  ∆ ϕ lay = | B | (cid:82) ∂B ρ lay+ = m lay+ | B | in B ∂ϕ lay ∂ν = ρ lay+ on ∂B , with (cid:82) ∂B ϕ lay = 0 (recall that ρ lay+ was deﬁned in Lemma 3.4). By (2.1) and H¨older’sinequality combined with (3.16), (cid:90) B |∇ ϕ lay | (cid:46) τ E + D. (3.73)We then let ρ lay := (1 − t ) m lay+ | B | and j lay := 2 ∇ ϕ lay for t ∈ (0 ,

12 ) , − τ ρ lay+ 12 ∂B f lay+ − τρ bdr , lay ∂B ρ lay ρ lay+ m lay+ | B | ∂B ∂B Figure 7: The deﬁnition of ρ lay and ρ bdr , lay . and extend them by zero for t ∈ ( , − τ ). Note that (3.60) automatically impliesthe smallness hypothesis (3.58). In view of the boundary value problem deﬁning ϕ lay we have (cid:90) B (cid:90) − τ ∂ t ζρ lay + ∇ ζ · j lay = (cid:90) ∂B (cid:90) ζρ lay+ − (cid:90) B ζ m lay+ | B | , (3.74)and (3.73) translates into (cid:90) B (cid:90) − τ | j lay | (cid:46) τ E + D. (3.75)Let ( ρ bdr , lay , j bdr , lay ) be deﬁned by Lemma 2.6 with f = ρ lay+ , f = f lay+ , a = 0 , b = 12 , and c = d = 1 − τ so that 1( d − b ) − ( c − a ) log d − bc − a = 12 log 1 − τ − τ ) (cid:46) . For these choices, (2.32) turns into (cid:90) ∂B (cid:90) − τ ∂ t ζρ bdr , lay + ∇ ζ · j bdr , lay = (cid:90) ∂B ζ − τ f lay+ − (cid:90) ∂B (cid:90) ζρ lay+ so that combining with (3.74) we ﬁnd (3.79). By (2.33) and (3.16) we also obtain (cid:90) ∂B (cid:90) − τ ρ bdr , lay | j bdr , lay | (cid:46) τ E d +3 d +2 + τ E d +2 D, which combined with (3.75) and the fact that τ E d +3 d +2 + τ E d +2 D (cid:46) τ E + D gives(3.72). (cid:48) ρ lay − µ − f − − τ ∂B ρ layin ∂B Figure 8: The deﬁnition of ρ lay − and ρ layin . Step 4.

We are left with the construction in [1 − τ, ρ bdr , lay , j bdr , lay )supported ∂B × [1 − τ,

1] and ( ρ lay , j lay ) supported in B × [1 − τ,

1] such that (cid:90) B (cid:90) − τ ∂ t ζρ lay + ∇ ζ · j lay + (cid:90) ∂B (cid:90) − τ ∂ t ζρ bdr , lay + ∇ bdr ζ · j bdr , lay = (cid:90) B ζ dµ − (cid:90) B ζ − τ (cid:32) − m lay+ | B | − | B | (cid:90) ∂B ( ρ + − ρ − ) (cid:33) − (cid:90) ∂B ζ − τ f lay+ + (cid:90) ∂B (cid:90) − τ ζf (3.76)and (cid:90) B (cid:90) − τ ρ lay | j lay | + (cid:90) ∂B (cid:90) − τ ρ bdr , lay | j bdr , lay | (cid:46) τ E + τ − D. (3.77)Note that combining (3.71) and (3.76), we get (cid:90) B (cid:90) ∂ t ζρ lay + ∇ ζ · j lay + (cid:90) ∂B (cid:90) ∂ t ζρ bdr , lay + ∇ bdr ζ · j bdr , lay = (cid:90) ∂B (cid:90) − τ ζf + (cid:90) B ζ dµ − (cid:90) B ζ − τ (cid:32) − m lay+ | B | − | B | (cid:90) ∂B ( ρ + − ρ − ) (cid:33) − ζ m lay+ | B | . (3.78)The construction of ( ρ bdr , lay , j bdr , lay ) takes care of the outgoing ﬂux f lay+ (recall (3.13))in [1 − τ,

1] by deﬁning ρ bdr , lay := (cid:90) t f lay+ and j bdr , lay := 0 on [1 − τ, , and thus at no cost. The construction of ( ρ lay , j lay ) for t ∈ (1 − τ,

1) is done inseveral steps (see Figure 8). We ﬁrst take care of the incoming ﬂow f − (recall (3.8))in [1 − τ,

1] and to this purpose take the corresponding bulk density from X itself, (cid:90) B (cid:90) − τ ζdρ lay − := (cid:90) Ω χ − τ

1) anddenote by f thr+ the ﬂux coming from particles that enter and leave B during (1 − τ, B , (cid:90) ∂B (cid:90) − τ ζdf thr+ := (cid:90) Ω χ − τ ≤ t −

1] (recall the deﬁnitions (3.8) and (3.13)).By the same argument that led to (3.5) we have (cid:90) B (cid:90) − τ ∂ t ζρ lay − + ∇ ζ · j lay − = (cid:90) B ζ dµ − + (cid:90) ∂B (cid:90) − τ ζ ( f thr+ − f − ) , so that (cid:90) B (cid:90) − τ ∂ t ζρ lay − + ∇ ζ · j lay − + (cid:90) ∂B (cid:90) − τ ∂ t ζρ bdr , lay + ∇ bdr ζ · j bdr , lay = (cid:90) B ζ dµ − − (cid:90) ∂B ζ − τ f lay+ + (cid:90) ∂B (cid:90) − τ ζf. (3.79)Furthermore, by (2.15) (cid:90) B (cid:90) − τ ρ lay − | j lay − | = sup ξ ∈ C ( B × [1 − τ, , R d ) (cid:90) B (cid:90) − τ ξ · j lay − − | ξ | ρ lay − = sup ξ ∈ C ( B × [1 − τ, , R d ) (cid:90) Ω χ − τ

1) = µ − i.e. we need toconnect Λ to µ (cid:48) := µ − µ − . Since µ (cid:48) coincides with the measure deﬁned in (3.26), by(3.27), W (Λ , µ (cid:48) ) (cid:46) τ E + D. We can thus use the Benamou-Brenier formulation of optimal transport (2.13) toﬁnd ( ρ layin , j layin ) rescaled from [0 ,

1] to [1 − τ,

1] and such that, (cid:90) B (cid:90) − τ ∂ t ζρ layin + ∇ ζ · j layin = (cid:90) B ζ dµ (cid:48) − (cid:90) B ζ − τ (cid:32) − m lay+ | B | − | B | (cid:90) ∂B ( ρ + − ρ − ) (cid:33) (3.81)and (cid:90) B (cid:90) − τ ρ layin | j layin | (cid:46) τ E + τ − D. (3.82)We thus let for t ∈ [1 − τ, ρ lay := ρ lay − + ρ layin and j lay := j lay − + j layin . Combining(3.79) and (3.81) we obtain (3.76). Moreover, using the subadditivity of (cid:82) ρ | j | ,(3.80), (3.82) and the fact that in [1 − τ, j bdr , lay = 0 we conclude the proof of(3.77). Step 5.

Combining (3.67) and (3.78), we see that (3.53) holds. Plugging (3.57),(3.64), (3.66), (3.72) and (3.77), into (3.56), we ﬁnd (cid:90) R d (cid:90) (cid:101) ρ | (cid:101) j | − (cid:90) B |∇ ϕ | (cid:46) τ E + ( τ − d E ) d +2 d +1 + D + (cid:16) E + ( τ − d E ) d +2 d +1 (cid:17) (cid:0) τ E + D (cid:1) + τ E + D + | log τ | E d +3 d +2 + τ E + τ − D (cid:46) τ E + ( τ − d E ) d +2 d +1 + | log τ | E d +3 d +2 + τ − D, where we used Young’s inequality together with the fact that τ (cid:28) E + D (cid:28) τ − d E ) d +2 d +1 + | log τ | E d +3 d +2 is super-linear in E , there exists 0 < ε ( τ ) (cid:28) E ≤ ε ( τ ), ( τ − d E ) d +2 d +1 + | log τ | E d +3 d +2 (cid:46) τ E. Therefore, if E + D ≤ ε ( τ ), (cid:90) R d (cid:90) (cid:101) ρ | (cid:101) j | − (cid:90) B |∇ ϕ | (cid:46) τ E + τ − D, which together with (3.54) proves (3.52).Combining (3.37) and (3.52), we obtain our main estimate, Proposition 1.6 which wenow recall for the reader’s convenience. roposition. For every < τ (cid:28) , there exist positive constants ε ( τ ) and C ( τ ) such that if E + D (cid:28) ε ( τ ) , then letting ϕ be deﬁned via (3.34) we have (cid:90) B (cid:90) ρ | j − ρ ∇ ϕ | (cid:46) τ E + C ( τ ) D. (3.83)Arguing exactly as in [23, Proposition 4.6], using the Benamou-Brenier formula(2.13), Lemma 2.3 and the harmonicity of ∇ ϕ (where ϕ is deﬁned in (3.34)), thisresult can be translated into Lagrangian terms, which gives Proposition 1.5, i.e. Proposition.

For every < τ (cid:28) , there exist positive constants ε ( τ ) and C ( τ ) such that if E + D ≤ ε ( τ ) , then there exists a function ϕ with harmonic gradient in B and such that (cid:90) B | T − ( x + ∇ ϕ ) | (cid:46) τ E + C ( τ ) D (3.84) and (cid:90) B |∇ ϕ | (cid:46) E. (3.85)With this estimate at hand, we can now prove as in [23, Proposition 4.7] one stepof a Campanato iteration (recall that E ( µ, T, R ) and D ( µ, O, R ) are deﬁned in (3.1)and (3.2)), i.e. Theorem 1.4 which we now recall. Theorem.

For every $0 < \tau \ll 1$, there exist positive constants $\varepsilon(\tau)$, $C(\tau)$ and $\theta > 0$ such that if $E(\mu, T, R) + D(\mu, O, R) \le \varepsilon(\tau)$, then there exists a symmetric matrix $B$ and a vector $b \in \mathbb{R}^d$ such that

$$|B - \mathrm{Id}|^2 + \frac{1}{R^2}|b|^2 \lesssim E(\mu, T, R), \tag{3.86}$$

and letting $\hat x := B^{-1}x$, $\hat\Omega := B^{-1}\Omega$ and then

$$\hat T(\hat x) := B(T(x) - b) \quad\text{and}\quad \hat\mu := \hat T \# \chi_{\hat\Omega}\, d\hat x, \tag{3.87}$$

we have

$$E(\hat\mu, \hat T, \theta R) \le \tau E(\mu, T, R) + C(\tau) D(\mu, O, R). \tag{3.88}$$

Proof.

The proof is analogous to the one of [23, Proposition 4.7] with minor modifications. Still, we give the proof for the reader’s convenience. By rescaling, we may assume that R = 1 and we then recall that E = E ( µ, T,

1) and D = D ( µ, O, τ (cid:48) to be ﬁxed later on and let then ϕ be given by Proposition 1.5 for τ (cid:48) . We deﬁne b := ∇ ϕ (0), A := ∇ ϕ (0) and then B := e − A so that B is symmetric. Since ∇ ϕ is harmonic, we obtain from (3.85) and the mean value formula that (3.86) holds. eﬁning ˆ T and ˆ µ as in (3.87), we get E (ˆ µ, ˆ T , θ ) = 1 θ d +2 (cid:90) B ( B θ ) | det B | − | B ( T − b ) − B − x | (cid:46) θ d +2 (cid:90) B θ | T − b − B − x | (cid:46) θ d +2 (cid:90) B θ | T − ( x + ∇ ϕ ) | + 1 θ d +2 (cid:90) B θ | ( B − − Id − A ) x | + 1 θ d +2 (cid:90) B θ |∇ ϕ − b − Ax | (cid:46) θ d +2 (cid:90) B θ | T − ( x + ∇ ϕ ) | + | B − − Id − A | + θ − sup B θ |∇ ϕ − b − Ax | . Recalling that B = e − A , b = ∇ ϕ (0) and A = ∇ ϕ (0), we conclude using again themean value formula for ∇ ϕE (ˆ µ, ˆ T , θ ) (3.84) (cid:46) θ − ( d +2) ( τ (cid:48) E + C ( τ (cid:48) ) D ) + |∇ ϕ (0) | + θ sup B θ |∇ ϕ | ≤ C (cid:16) θ − ( d +2) ( τ (cid:48) E + C ( τ (cid:48) ) D ) + E + θ E (cid:17) . Choosing ﬁrst θ small enough so that C ( E + θ ) ≤ τ and then τ (cid:48) small enough sothat Cθ − ( d +2) τ (cid:48) ≤ τ , we see that we can guarantee that (3.88) is satisﬁed. T We now turn back to the optimal matching problem and combine Theorem 2.10 andTheorem 1.4 to obtain the desired quantitative estimate on the transport map. Letus recall that we work here in dimension d = 2.Let us recall that for for every dyadic L , we consider µ a realization of the Q L − periodicPoisson point process (see Section 2.4) and let T = T µ,L be the optimal transportmap between µ ( Q L ) L and µ for the periodic transport problem (2.24), i.e. a minimizerof W , per ( µ ( Q L ) L , µ ). By Theorem 2.10, there exist a constant c > r ∗ ,L such that sup L E L (cid:20) exp (cid:18) cr ∗ ,L log 2 r ∗ ,L (cid:19)(cid:21) < ∞ and such that if 2 r ∗ ,L ≤ L ,recalling (2.57), µ ( Q L ) L ∈ (cid:20) , (cid:21) , Spt µ ∩ B r ∗ ,L (cid:54) = ∅ . 
(4.1)For µ such that 2 r ∗ ,L ≤ L , we let ˆ µ := L µ ( Q L ) µ so that T is also the optimal transportmap (for the periodic transport problem (2.24)) between the Lebesgue measure andˆ µ . By (2.58) of Theorem 2.10, we have that for all dyadic (cid:96) with 2 r ∗ ,L ≤ (cid:96) ≤ L , (cid:96) W (cid:18) ˆ µ (cid:96) , ˆ µ ( Q (cid:96) ) (cid:96) (cid:19) (cid:46) log (cid:16) (cid:96)r ∗ ,L (cid:17)(cid:16) (cid:96)r ∗ ,L (cid:17) (4.2)so that by (2.58) also 1 L W , per (ˆ µ, (cid:46) log (cid:16) Lr ∗ ,L (cid:17)(cid:16) Lr ∗ ,L (cid:17) . (4.3)Let us recall (see Section 2.3 and in particular Remark 2.5 for more details) that by[14], there exists a convex function ψ on R such that the map T can be identiﬁedon R with a measurable selection of the subgradient ∂ψ of ψ .Let y L = y L ( µ ) := argmin Spt µ | y | (which is uniquely deﬁned P L − a.e.) and x L thebarycenter of its pre-image under T , i.e. x L = x L ( µ ) := 1 | T − ( y L ) | (cid:90) T − ( y L ) xdx, so that the map µ → x L is P L − measurable. Let us show that T ( x L ) = y L . Byconvexity of ψ the set ( ∂ψ ) − ( y L ) = ∂ψ ∗ ( y L ) is convex. Since ψ is diﬀerentiablea.e., we have | ( ∂ψ ) − ( y L ) \ T − ( y L ) | = 0 and | ( ∂ψ ) − ( y L ) | = µ ( y L ) > x L lies in the interior of ( ∂ψ ) − ( y L ). Since ∇ ψ = y L a.e. on this set, ψ is aﬃne inside( ∂ψ ) − ( y L ) and thus ψ is diﬀerentiable at x L with T ( x L ) = ∇ ψ ( x L ) = y L . Therefore,by the deﬁnition of y L and (4.1) we have for µ such that 2 r ∗ ,L ≤ L , T ( x L ) = y L and | y L | ≤ r ∗ ,L . (4.4)Finally, we can prove our main result, Theorem 1.1 which we recall for the convenienceof the reader. Theorem.

Let L (cid:29) be dyadic and µ be a Q L − periodic Poisson point process.Then, if µ is such that r ∗ ,L (cid:28) L , | x L | (cid:46) r ∗ ,L log (cid:18) Lr ∗ ,L (cid:19) (4.5) and for every r ∗ ,L ≤ (cid:96) ≤ L , (cid:96) (cid:90) B (cid:96) ( x L ) | T − ( x − x L ) | (cid:46) log (cid:16) (cid:96)r ∗ ,L (cid:17)(cid:16) (cid:96)r ∗ ,L (cid:17) . (4.6) Proof. Step 1. [The setup] By periodicity we have E := 1 L (cid:90) B L | T − x | (cid:46) L (cid:90) Q L | T − x | = 1 L W , per (ˆ µ, (4.3) (cid:46) log (cid:16) Lr ∗ ,L (cid:17)(cid:16) Lr ∗ ,L (cid:17) . (4.7) et (cid:101) µ := ∇ ψ χ B L dx ) so that T is the Euclidean optimal transport map between χ B L dx and (cid:101) µ . By the L ∞ bound (2.18), we have that (cid:101) µ = ˆ µ on Q L . Let ﬁnally T ( x ) := T ( x L + x ) − x L and µ := (cid:101) µ ( · + x L ) (4.8)so that (4.4) becomes T (0) = y L − x L with | y L | ≤ r ∗ ,L . (4.9)Denote by (cid:96) the largest dyadic (cid:96) such that B (cid:96) ( x L ) ⊂ Q L . Fix 0 < τ (cid:28) L ∞ bound (2.18), | x L | (cid:46) E L (cid:28) L so that (cid:96) ∼ L . By (4.7) we thus have that (recall (3.1) and (3.2)) E := E ( µ , T , (cid:96) ) = 1(2 (cid:96) ) (cid:90) B (cid:96) | T − x | and D := D ( µ , Q L − x L , (cid:96) ) = 1 (cid:96) W (cid:18)(cid:101) µ Q L , (cid:101) µ ( Q L ) | Q L | (cid:19) satisfy for L large enough E + D ≤ ε ( τ ) . (4.10)We also let B := Id, b := 0 , Ω := B L − x L and O := Q L − x L . Let θ > θ is dyadic i.e. θ = 2 − j for some j ∈ N . For k ≥

1, let (cid:96) k := θ k (cid:96) and notice that (cid:96) k is also dyadic since (cid:96) is. It is of course enough to show that (4.6) holds for (cid:96) = (cid:96) k .We now prove by induction that there exist C ∗ , C , C , C > k ≥ (cid:96) k ≥ C ∗ r ∗ ,L , we can ﬁnd a symmetricmatrix B k and a vector b k such that | B k − Id | + 1 (cid:96) k | b k | ≤ C log (cid:16) (cid:96) k r ∗ ,L (cid:17)(cid:16) (cid:96) k r ∗ ,L (cid:17) , (4.11)and letting T k ( x ) := B k ( T k − ( B k x ) − b k ), Ω k := B − k Ω k − and µ k := T k χ Ω k wehave for E k := E ( µ k , T k , (cid:96) k ), E k ≤ C log (cid:16) (cid:96) k r ∗ ,L (cid:17)(cid:16) (cid:96) k r ∗ ,L (cid:17) , (4.12)and T k is the optimal transport map between χ Ω k and µ k . Moreover, we may ﬁnd atarget set O k such that letting D k := D ( µ k , O k , (cid:96) k ), we have B (cid:96) k ⊂ O k and D k ≤ C log (cid:16) (cid:96) k r ∗ ,L (cid:17)(cid:16) (cid:96) k r ∗ ,L (cid:17) . (4.13) s we shall argue, letting A = Id , a = 0 and then for k ≥ A k := B k A k − and a k := B k a k − + B k b k , (4.14)that is A k = B k B k − · · · B and a k = k (cid:88) i =0 B k B k − · · · B i b i , this entails T k ( x ) = A k T ( A ∗ k x ) − a k (4.8) = A k T ( A ∗ k x + x L ) − ( A k x L + a k ) , (4.15)where A ∗ denotes the transpose of A , | A k − Id | (cid:46) log (cid:16) (cid:96) k r ∗ ,L (cid:17)(cid:16) (cid:96) k r ∗ ,L (cid:17) and | a k | (cid:46) k r ∗ ,L log (cid:18) Lr ∗ ,L (cid:19) . (4.16)Notice that (4.12) and (4.13) in particular imply that if (cid:96) k ≥ C ∗ r ∗ ,L , then E k + D k ≤ ε ( τ ) . (4.17) Step 2. [The iteration argument] By (4.10) the induction hypothesis is satisﬁed for k = 0. Let us assume that it holds for k − Step 2.1. [Proof of (4.11) and (4.12)] Thanks to (4.17) we may apply Theorem 1.4with R = (cid:96) k − and O = O k − (recall that we ﬁxed τ (cid:28)

1) to ﬁnd a symmetric matrix B k and a vector b k ∈ R such that | B k − Id | + 1 (cid:96) k | b k | ≤ CE k − ≤ CC log (cid:16) (cid:96) k − r ∗ ,L (cid:17)(cid:16) (cid:96) k − r ∗ ,L (cid:17) ≤ C log (cid:16) (cid:96) k r ∗ ,L (cid:17)(cid:16) (cid:96) k r ∗ ,L (cid:17) , if C is taken large enough (depending only on C ). From this we see that (4.11) issatisﬁed. Moreover, by (3.88) E k ≤ τ E k − + C ( τ ) D k − ≤ C (cid:18) τ + C ( τ ) C C (cid:19) log (cid:16) (cid:96) k − r ∗ ,L (cid:17)(cid:16) (cid:96) k − r ∗ ,L (cid:17) . Now if C ≥ − τ C ( τ ) C , since the function f ( t ) := log tt is decreasing for t largeenough, if (cid:96) k = θ(cid:96) k − ≥ C ∗ r ∗ ,L for some universal constant C ∗ large enough then f ( (cid:96) k /r ∗ ,L ) ≥ f ( (cid:96) k − /r ∗ ,L ) and thus E k ≤ C f (cid:18) (cid:96) k − r ∗ ,L (cid:19) ≤ C f (cid:18) (cid:96) k r ∗ ,L (cid:19) = C log (cid:16) (cid:96) k r ∗ ,L (cid:17)(cid:16) (cid:96) k r ∗ ,L (cid:17) , hich proves (4.12). Step 2.2. [Optimality of T k ] Since T k − is the optimal transport map between χ Ω k − and µ k − , by Brenier’s Theorem [31, Theorem 2.12], there exists a convex map ψ k − such that T k − = ∇ ψ k − . Then T k = ∇ ψ k for the convex function ψ k ( x ) := ψ k − ( B k x ) − b k · B k x so that T k is the optimal transport map between χ Ω k and µ k (see [31, Theorem 2.12]). Step 2.3. [Derivation of (4.16)] For k > i ≤ k we ﬁrst prove that for (cid:96) k /r ∗ ,L (cid:29) k (cid:88) j = i  log (cid:16) (cid:96) j r ∗ ,L (cid:17)(cid:16) (cid:96) j r ∗ ,L (cid:17)  (cid:46) log (cid:16) (cid:96) k r ∗ ,L (cid:17) (cid:96) k r ∗ ,L . 
(4.18)Since the function log tt is decreasing for t large enough, we obtain k (cid:88) j = i  log (cid:16) (cid:96) j r ∗ ,L (cid:17)(cid:16) (cid:96) j r ∗ ,L (cid:17)  (cid:46) k (cid:88) j = i (cid:90) (cid:96)jr ∗ ,L(cid:96)j +1 r ∗ ,L log tt = (cid:90) (cid:96)ir ∗ ,L(cid:96)k +1 r ∗ ,L log tt ≤ (cid:90) ∞ (cid:96)k +1 r ∗ ,L log tt (cid:46) (cid:16) (cid:96) k +1 r ∗ ,L (cid:17) (cid:96) k +1 r ∗ ,L (cid:46) log (cid:16) (cid:96) k r ∗ ,L (cid:17) (cid:96) k r ∗ ,L , which proves (4.18).We can now make a downward induction on i to show that (4.11) implies that for k > i ≤ k | B k B k − · · · B i − Id | ≤ (cid:112) C k (cid:88) j = i  log (cid:16) (cid:96) j r ∗ ,L (cid:17)(cid:16) (cid:96) j r ∗ ,L (cid:17)  (4.19)which combined with (4.18) implies | B k B k − · · · B i − Id | (cid:46) log (cid:16) (cid:96) k r ∗ ,L (cid:17)(cid:16) (cid:96) k r ∗ ,L (cid:17) . (4.20)Notice that (4.20) in particular gives | B k B k − · · · B i | ≤ C ∗ largeenough. Estimate (4.20) contains the ﬁrst part of (4.16) and the second part would lso follow since for every i , | B k B k − · · · B i b i | (4.20) ≤ | b i | (4.11) ≤ (cid:18) C r ∗ ,L log (cid:18) Lr ∗ ,L (cid:19)(cid:19) . It thus remains to prove (4.19) which clearly holds for i = k by (4.11). Assume (4.19)holds for i . 
Then as already pointed out, (4.20) implies | B k B k − · · · B i | ≤ (cid:96) k r ∗ ,L large enough so that we can estimate | B k B k − · · · B i − − Id | ≤ | B k B k − · · · B i − Id | + | B k B k − · · · B i ( B i − − Id ) |≤ (cid:112) C k (cid:88) j = i  log (cid:16) (cid:96) j r ∗ ,L (cid:17)(cid:16) (cid:96) j r ∗ ,L (cid:17)  + | B k · · · B i || B i − − Id | (4.11) ≤ (cid:112) C k (cid:88) j = i  log (cid:16) (cid:96) j r ∗ ,L (cid:17)(cid:16) (cid:96) j r ∗ ,L (cid:17)  + 2 (cid:112) C  log (cid:16) (cid:96) i − r ∗ ,L (cid:17)(cid:16) (cid:96) i − r ∗ ,L (cid:17)  ≤ (cid:112) C k (cid:88) j = i −  log (cid:16) (cid:96) j r ∗ ,L (cid:17)(cid:16) (cid:96) j r ∗ ,L (cid:17)  . Step 2.4. [Proof of (4.13)] We ﬁrst notice that since θ (cid:28) (cid:96) k (cid:28) (cid:96) k − and we recallthat (cid:96) k − is dyadic. We set O k := A k Q (cid:96) k − − ( A k x L + a k ) (4.21)and notice that by the L ∞ bound (2.18) applied to T k we have | A k y L − ( A k x L + a k ) | (4.15)&(4.9) = | T k (0) | (cid:46) (cid:96) k E k so that from (4.12) in the form of E k (cid:28)

1, the ﬁrst part of (4.16) in the form of | A k − Id | (cid:28) | A k x L + a k | ≤ | A k y L − ( A k x L + a k ) | + | A k y L | (cid:46) (cid:96) k E k + r ∗ ,L (cid:28) (cid:96) k . (4.22)Therefore, using again that | A k − Id | (cid:28) B (cid:96) k ⊂ Q (cid:96)k − imply that B (cid:96) k ⊂ A k Q (cid:96) k − − ( A k x L + a k ) = O k so that the ﬁrst part of (4.13) holds.Let us prove that the second part of (4.13) also holds. Let (cid:101) T k be the optimal transportmap between the constant measure on Q (cid:96) k − and the restriction of the measure (cid:101) µ tothis set i.e. W (cid:18)(cid:101) µ Q (cid:96) k − , (cid:101) µ ( Q (cid:96) k − ) | Q (cid:96) k − | χ Q (cid:96)k − (cid:19) = (cid:90) Q (cid:96)k − | (cid:101) T k − y | (cid:101) µ ( Q (cid:96) k − ) | Q (cid:96) k − | . e then let for z ∈ O k , (cid:98) T k ( z ) := A k (cid:101) T k ( A − k ( z + A k x L + a k )) − ( A k x L + a k ). We ﬁrstshow that (cid:98) T k µ k ( O k ) | O k | χ O k = µ k O k . For this we notice that by deﬁnition of µ k , if (cid:101) µ Q (cid:96) k − = α (cid:88) i δ y i then µ k O k = α | det A k | (cid:88) i δ A k y i − ( A k x L + a k ) , (4.23)so that µ k ( O k ) = (cid:101) µ ( Q (cid:96)k − ) | det A k | and | O k | = | det A k || Q (cid:96) k − | . For ζ ∈ C ( O k ) we thus have (cid:90) O k ζ (cid:98) T k µ k ( O k ) | O k | = (cid:90) O k ζ ( A k (cid:101) T k ( A − k ( z + A k x L + a k )) − ( A k x L + a k )) µ k ( O k ) | O k | = (cid:90) Q (cid:96)k − ζ ( A k (cid:101) T k − ( A k x L + a k )) µ k ( O k ) | O k | | det A k | = (cid:90) Q (cid:96)k − ζ ( A k y − ( A k x L + a k )) µ k ( O k ) | O k | | Q (cid:96) k − | (cid:101) µ ( Q (cid:96) k − ) | det A k | d (cid:101) µ ( y )= (cid:90) Q (cid:96)k − ζ ( A k y − ( A k x L + a k )) | det A k | − d (cid:101) µ ( y ) (4.23) = (cid:90) O k ζdµ k , proving that indeed (cid:98) T k µ k ( O k ) | O k | χ O k = µ k O k . 
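The change of variables used above can be illustrated on a toy example. The following Python sketch is an illustration under explicit assumptions, not the construction of the paper: it replaces $\tilde T_k$ by a hypothetical map `quadrant_map` on the unit square $Q$, picks a hypothetical symmetric matrix `A` and shift `c` (standing in for $A_k$ and $A_k x_L + a_k$), and checks two things. First, the conjugated map $z \mapsto A\,T(A^{-1}(z + c)) - c$ sends a uniform grid on $AQ - c$ to the affinely transported atoms with equal weights. Second, its mean squared displacement is comparable to that of $T$ when $A$ is close to the identity.

```python
def mat_vec(A, v):
    """Apply a 2x2 matrix (nested lists) to a 2-vector (tuple)."""
    return (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])

def inverse(A):
    """Inverse of an invertible 2x2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

def quadrant_map(y):
    """Toy map on the unit square: send y to the centre of its quadrant."""
    return (0.25 if y[0] < 0.5 else 0.75, 0.25 if y[1] < 0.5 else 0.75)

def conjugate(T, A, c):
    """Return z -> A T(A^{-1}(z + c)) - c, mirroring the definition of hat T_k."""
    A_inv = inverse(A)
    def T_hat(z):
        y = mat_vec(A_inv, (z[0] + c[0], z[1] + c[1]))
        Ty = mat_vec(A, T(y))
        return (Ty[0] - c[0], Ty[1] - c[1])
    return T_hat

def grid(n):
    """Midpoints of an n x n grid on the unit square Q."""
    return [((i + 0.5) / n, (j + 0.5) / n) for i in range(n) for j in range(n)]

def mean_costs(T, A, c, n=40):
    """Mean squared displacement of T on Q and of T_hat on O = A Q - c."""
    T_hat = conjugate(T, A, c)
    cost, cost_hat = 0.0, 0.0
    for y in grid(n):
        Ay = mat_vec(A, y)
        z = (Ay[0] - c[0], Ay[1] - c[1])
        Ty, Tz = T(y), T_hat(z)
        cost += (Ty[0] - y[0]) ** 2 + (Ty[1] - y[1]) ** 2
        cost_hat += (Tz[0] - z[0]) ** 2 + (Tz[1] - z[1]) ** 2
    return cost / n**2, cost_hat / n**2

def image_counts(T, A, c, n=40):
    """Count how many grid points of O are sent by T_hat to each atom."""
    T_hat = conjugate(T, A, c)
    counts = {}
    for y in grid(n):
        Ay = mat_vec(A, y)
        w = T_hat((Ay[0] - c[0], Ay[1] - c[1]))
        key = (round(w[0], 9), round(w[1], 9))
        counts[key] = counts.get(key, 0) + 1
    return counts

A = [[1.05, 0.02], [0.02, 0.97]]  # hypothetical symmetric matrix close to Id
c = (0.3, -0.1)                   # hypothetical shift, stands in for A_k x_L + a_k
cost, cost_hat = mean_costs(quadrant_map, A, c)
counts = image_counts(quadrant_map, A, c)
frobenius_sq = sum(a * a for row in A for a in row)  # upper bound for |A|^2
```

Since $\hat T(z) - z = A(T(y) - y)$ for $z = Ay - c$, the cost of the conjugated map is bounded by $|A|^2$ times the cost of $T$, which is the mechanism by which the bound on $|A_k - \mathrm{Id}|$ enters the estimate of $D_k$.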
If we now use (cid:98) T k as competitor forthe optimal transport problem between µ k ( O k ) | O k | χ O k and µ k O k , we obtain D k ≤ (cid:96) k (cid:90) O k | (cid:98) T k − z | µ k ( O k ) | O k | = 1 (cid:96) k (cid:90) O k | A k (cid:101) T k ( A − k ( z + A k x L + a k )) − ( A k x L + a k ) − z | µ k ( O k ) | O k | (4.16)&(4.21) (cid:46) (cid:96) k (cid:90) Q (cid:96)k − | (cid:101) T k − y | (cid:101) µ ( Q (cid:96) k − ) | Q (cid:96) k − | (4.2) (cid:46) log (cid:16) (cid:96) k r ∗ ,L (cid:17)(cid:16) (cid:96) k r ∗ ,L (cid:17) , where we used that (cid:101) µ = ˆ µ in Q (cid:96) k − . This concludes the proof of the second part of(4.13). Step 3. [Conclusion] We can thus iterate this procedure up to K = (cid:36) log (cid:96) C ∗ r ∗ ,L | log θ | (cid:37) ∼ log Lr ∗ ,L . By (4.16) we have | A − K a K | (cid:46) r ∗ ,L log (cid:18) Lr ∗ ,L (cid:19) . (4.24) sing the L ∞ bound (2.18) for T K , we obtain | x L + A − K a K | (cid:46) | A K T ( x L ) − ( A K x L + a K ) | + | T ( x L ) | (cid:46) | T K (0) | + | y L | (cid:46) r ∗ ,L (4.25)which together with (4.24) gives (4.5).We now prove (4.6). Since B (cid:96) k +1 ( x L ) ⊂ A ∗ k B (cid:96) k + x L and recalling that T k ( x ) = A k T ( A ∗ k x + x L ) − ( A k x L + a k ) (see (4.15)), we can ﬁrst estimate1 (cid:96) k +1 (cid:90) B (cid:96)k +1 ( x L ) | T − ( x + A − k a k ) | ≤ (cid:96) k +1 (cid:90) A ∗ k B (cid:96)k + x L | T − ( x + A − k a k ) | (cid:46) (cid:96) k (cid:90) B (cid:96)k | A k T ( A ∗ k y + x L ) − A k ( A ∗ k y + x L + A − k a k ) | (cid:46) (cid:96) k (cid:90) B (cid:96)k | A k T ( A ∗ k y + x L ) − ( A k x L + a k ) − y | + 1 (cid:96) k (cid:90) B (cid:96)k | ( Id − A k A ∗ k ) y | (cid:46) (cid:32) (cid:96) k (cid:90) B (cid:96)k | T k − y | (cid:33) + | Id − A k A ∗ k | (cid:46) log (cid:16) (cid:96) k r ∗ ,L (cid:17)(cid:16) (cid:96) k r ∗ ,L (cid:17) . 
(4.26)Since by deﬁnition (recall (4.14)) a i = B i a i − + B i b i , we have A − i a i − A − i − a i − = A − i − b i and thus for 1 ≤ k < K , we have | A − K a K − A − k a k | ≤ (cid:32) K (cid:88) i = k +1 | A − i a i − A − i − a i − | (cid:33) = (cid:32) K (cid:88) i = k +1 | A − i − b i | (cid:33) (cid:46) (cid:32) K (cid:88) i = k +1 | b i | (cid:33) (cid:46) r ∗ ,L ( K − k ) log (cid:18) (cid:96) k r ∗ ,L (cid:19) (cid:46) r ∗ ,L log (cid:18) (cid:96) k r ∗ ,L (cid:19) . (4.27)Noticing that by (4.25), it is enough to prove (4.6) with A − K a K instead of − x L , weconclude by (4.26) and (4.27) that1 (cid:96) k +1 (cid:90) B (cid:96)k +1 ( x L ) | T − ( x + A − K a K ) | (cid:46) (cid:32) (cid:96) k +1 (cid:90) B (cid:96)k +1 ( x L ) | T − ( x + A − k a k ) | (cid:33) + 1 (cid:96) k | A − K a K − A − k a k | (cid:46) log (cid:16) (cid:96) k r ∗ ,L (cid:17)(cid:16) (cid:96) k r ∗ ,L (cid:17) , and obtain (4.6). emark 4.1. We would like to highlight that, although our estimate (4.6) is notoptimal with respect to the power on the logarithmic term, estimate (4.26) leads tothe optimal estimate inf ξ ∈ R (cid:96) (cid:90) B (cid:96) ( x L ) | T − ( x − ξ ) | (cid:46) log (cid:16) (cid:96)r ∗ ,L (cid:17)(cid:16) (cid:96)r ∗ ,L (cid:17) ∀ r ∗ ,L ≤ (cid:96) ≤ L. The suboptimal rate in (4.6) comes from the bound (4.16) which does not take can-cellations into account.
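Two summation facts drive the iteration: the sum in (4.18) is dominated by its smallest-scale term, and the triangle-inequality bound in (4.27) costs a factor $(K-k)^2 \lesssim \log^2(\ell_k/r_{*,L})$. Both can be checked numerically. The following Python sketch is only an illustration, not part of the proof. It assumes the rate function $f(t) = \log t / t^2$, scales $\ell_j = \theta^j \ell_0$, and shifts of size $|b_i|^2 \sim r_{*,L}^2 \log(\ell_i/r_{*,L})$; the function names and the sample values of $\theta$, $\ell_0/r_{*,L}$ and $C_*$ are hypothetical.

```python
import math

def f(t):
    """Assumed rate function t -> log(t) / t**2."""
    return math.log(t) / t**2

def scales(ell0_over_r, theta, c_star):
    """Ratios ell_k / r = theta**k * ell_0 / r, kept while they stay >= c_star."""
    out = []
    t = ell0_over_r
    while t >= c_star:
        out.append(t)
        t *= theta
    return out

def sum_vs_last(ts):
    """Ratio of sum_j f(t_j) to the last (smallest-scale) term, cf. (4.18)."""
    return sum(f(t) for t in ts) / f(ts[-1])

def shift_bound_ratio(ts, k):
    """(sum_{i>k} sqrt(log t_i))**2 divided by log(t_k)**3, cf. (4.27)."""
    total = sum(math.sqrt(math.log(t)) for t in ts[k + 1:])
    return total**2 / math.log(ts[k])**3

if __name__ == "__main__":
    ts = scales(4.0**10, 0.25, 16.0)  # hypothetical: theta = 1/4, C_* = 16
    print("sum / last term:", sum_vs_last(ts))
    for k in range(len(ts)):
        print(k, shift_bound_ratio(ts, k))
```

In this model the second ratio never exceeds $1/\log^2(1/\theta)$, because there are at most $\log(\ell_k/r_{*,L})/\log(1/\theta)$ terms and each is at most $\sqrt{\log(\ell_k/r_{*,L})}$. This is the counting that produces the cubic power of the logarithm and that, as noted in Remark 4.1, ignores cancellations between the shifts.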

In this section we show how Theorem 1.1 can be used to derive locally optimal couplings between the Lebesgue measure and the Poisson measure on $\mathbb{R}^2$. For this we will use the optimal transport maps $T_L = T_{\mu,L}$ constructed above and pass to the limit as $L \to \infty$. Since the transport cost per unit volume diverges logarithmically, see [6] or Section 2.4, we will need to use a renormalization procedure. Therefore, while the approximating couplings enjoy strong stationarity properties, cf. (2.53), the limiting couplings themselves will not. However, the shift stationarity property will be shown to survive in the second-order increments of the corresponding Kantorovich potentials. In order to set up the limit procedure, we need to equip the configuration space $\Gamma$ (see Section 2.4) and the set of potentials with a topology.

We equip $\Gamma$ with the topology obtained by testing against continuous and compactly supported functions. Denote the space of all real-valued convex functions $\psi : \mathbb{R}^2 \to \mathbb{R}$ by $\mathcal{K}$. We equip $\mathcal{K}$ (and $C(\mathbb{R}^2)$) with the topology of uniform convergence on compact sets. Let us point out that with these topologies, both $\Gamma$ and $\mathcal{K}$ are metrizable, which makes them Polish spaces. On $\mathcal{P}(\Gamma \times \mathcal{K})$, we will consider the weak topology given by testing against functions in $C_b(\Gamma \times \mathcal{K})$.

Denote by $\widehat\psi_L$ the convex function on $\mathbb{R}^2$ such that $T_L = \nabla\widehat\psi_L$ on $Q_L$ and (2.25) holds, i.e. $\nabla\widehat\psi_L(x + z) = \nabla\widehat\psi_L(x) + z$ for all $(x, z) \in \mathbb{R}^2 \times (L\mathbb{Z})^2$ and $\nabla\widehat\psi_L \# dx = \frac{L^2}{\mu(Q_L)}\mu$. Since $x_L$ (defined above Theorem 1.1) is logarithmically diverging in $L$, we will need to translate either the Lebesgue measure or the Poisson measure by a logarithmically diverging factor in order to pass to the limit. Since the Lebesgue measure (on $\mathbb{R}^2$) is invariant under such translations while the Poisson point process is not, it is better to make this shift in the domain rather than in the image and set

$$\psi_L(x) := \widehat\psi_L(x + x_L). \tag{4.28}$$

Note that

$$\nabla\psi_L \# dx = \frac{L^2}{\mu(Q_L)}\mu, \tag{4.29}$$

$\nabla\psi_L(0) = 0$ and the Legendre conjugate $\psi_L^*$ of $\psi_L$ satisfies $\psi_L^*(y) = \widehat\psi_L^*(y) - x_L \cdot y$. By adding a constant to $\psi_L$ we may assume that $\psi_L(0) = 0$. Notice that by (2.56) of Lemma 2.9 and recalling that $D_h^2\psi^*(y) = \psi^*(y + h) + \psi^*(y - h) - 2\psi^*(y)$, we still have

$$D_h^2\psi^*_{\theta_z\mu, L}(y) = D_h^2\psi^*_{\mu, L}(y + z). \tag{4.30}$$

Let us point out that because of the shift introduced in (4.28), the same invariance does not hold for $\psi_L$.

For a given $\mu \in \Gamma$, the bound (4.6) directly translates into locally uniform $L^2$ bounds for $\nabla\psi_{\mu,L}$ which by convexity of $\psi_{\mu,L}$ yields compactness of $(\psi_{\mu,L})_L$ in $\mathcal{K}$ (see (4.32) below). Therefore, up to subsequence, $\psi_{\mu,L}$ converges locally uniformly to a convex function $\psi_\mu$ satisfying $\nabla\psi_\mu \# dx = \mu$. However, since we do not have any uniqueness property of this limit, the subsequence depends a priori on $\mu$ and we need to pass to the limit in the sense of Young measures. For this purpose, we first define the map

$$\Psi_L : \Gamma \to \mathcal{K}, \qquad \mu \mapsto \psi_L,$$

which is measurable and depends only on $\mu \llcorner Q_L$, and then define the probability measure $q_L \in \mathcal{P}(\Gamma \times \mathcal{K})$ by

$$q_L := (\mathrm{id}, \Psi_L) \# \mathbb{P}_L = \mathbb{P}_L \otimes \delta_{\Psi_L}. \tag{4.31}$$

We will show that the sequence $(q_L)_L$ is tight and that any limit point $q$ gives full mass to pairs $(\mu, \psi)$ such that $\nabla\psi \# dx = \mu$ and such that the second-order increments of $\psi^*$ are shift covariant. The crucial ingredient is the following lemma that gives us a uniform control on the potentials $\psi_L$.

Lemma 4.2.

There exists a constant $C > 0$ such that for every dyadic $L$ and every $\mu \in \Gamma$ such that $r_{*,L} \ll L$, there holds for $x \in \mathbb{R}^2$

$$\frac{1}{4}|x|^2 - C r_{*,L}^2 \le \psi_L(x) \le |x|^2 + C r_{*,L}^2. \tag{4.32}$$

Therefore, letting for $\lambda \in \mathbb{R}$,

$$F_\lambda := \left\{ \psi \in \mathcal{K} : \frac{1}{4}|x|^2 - \lambda \le \psi \le |x|^2 + \lambda \right\} \tag{4.33}$$

this means that if $r_{*,L} \ll L$, $\psi_L \in F_\lambda$ for every $\lambda \ge C r_{*,L}^2$.

Proof. Let us prove that

$$\sup_{r_{*,L} \ll \ell} \frac{1}{\ell^4} \int_{B_\ell} |\nabla\psi_L - x|^2 \ll 1. \tag{4.34}$$

For $\ell \le L$, this directly follows from (4.6) and the definition of $\psi_L$. If now $\ell = kL$ for some $k \in \mathbb{N}$, $Q_L$-periodicity of the function $(\nabla\psi_L - x)$ yields

$$\frac{1}{\ell^4} \int_{Q_\ell} |\nabla\psi_L - x|^2 = \frac{1}{L^2\ell^2} \int_{Q_L} |\nabla\psi_L - x|^2 \le \frac{1}{L^4} \int_{Q_L} |\nabla\psi_L - x|^2 \ll 1,$$

so that (4.34) can also be obtained for $\ell \ge L$. Letting $f(x) := \psi_L(x) - \frac{1}{2}|x|^2$, this implies together with the $L^\infty$-bound given by Lemma 2.3 that for $r_{*,L} \ll \ell$,

$$\sup_{B_\ell} |\nabla f| \ll \ell,$$

which can be rewritten as $|\nabla f(x)| \ll |x|$ for $r_{*,L} \ll |x|$. Using $f(0) = 0$, we obtain from integration $|f(x)| \ll |x|^2$ for $r_{*,L} \ll |x|$. Going back to the definition of $f$, this concludes the proof of (4.32).

This lemma endows us with the necessary compactness to prove the main result of this subsection, the convergence of $(\Psi_L)_L$ in terms of Young measures. This is precisely Theorem 1.2 which we recall for the convenience of the reader.

Theorem.

The sequence of probability measures $(q_L)_L$ (cf. (4.31)) is tight in $\mathcal{P}(\Gamma \times \mathcal{K})$. Moreover, any accumulation point $q$ satisfies the following properties:

(i) The first marginal of $q$ is the Poisson point process;

(ii) $q$ almost surely $\nabla\psi \# dx = \mu$;

(iii) for any $h, z \in \mathbb{R}^2$ and $f \in C_b(\Gamma \times C(\mathbb{R}^2))$ there holds

$$\int_{\Gamma \times \mathcal{K}} f(\mu, D_h^2\psi^*)\, dq = \int_{\Gamma \times \mathcal{K}} f(\theta_{-z}\mu, D_h^2\psi^*(\cdot - z))\, dq.$$

Proof. Step 1.

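We start with tightness. Since trivially $\mu \llcorner Q_L \to \mu$ in $\Gamma$ we have that $\mathbb{P}_L \to \mathbb{P}$ weakly in $\mathcal{P}(\Gamma)$. In particular, the sequence $(\mathbb{P}_L)_L$ is tight and for any $\varepsilon > 0$ there exists a compact set $\Gamma_\varepsilon \subset \Gamma$ such that for all $L$ we have $\mathbb{P}_L(\Gamma_\varepsilon) \ge 1 - \varepsilon$. Since by Theorem 2.10 we have

$$\sup_L \mathbb{E}_L\left[\exp\left(\frac{c\, r_{*,L}^2}{\log(2 r_{*,L})}\right)\right] < \infty,$$

there is a constant $\lambda$ such that for each $L$ large enough $\mathbb{P}_L(\{r_{*,L} \le \sqrt{\lambda}\}) \ge 1 - \varepsilon$. Lemma 4.2 implies that $\Psi_L(\{r_{*,L} \le \sqrt{\lambda}\}) \subset F_\lambda$, so that

$$q_L(\Gamma \times F_\lambda) \ge 1 - \varepsilon. \tag{4.35}$$

Because of convexity, local boundedness yields local compactness in uniform topology. Thus setting $K_\varepsilon := \Gamma_\varepsilon \times F_\lambda$ we have that $K_\varepsilon$ is compact and

$$q_L((K_\varepsilon)^c) \le 2\varepsilon,$$

which proves tightness. Moreover, since $\mathbb{P}_L \to \mathbb{P}$ weakly in $\mathcal{P}(\Gamma)$, item (i) is shown.

Step 2.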

To show (ii) we define for $k, n \in \mathbb{N}$ the set $G_{k,n} \subset \Gamma \times \mathcal{K}$ by (recall the definition of $F_k$ given in (4.33))

$$G_{k,n} := \left\{ (\mu, \psi) \in \Gamma \times F_k : \left(1 - \frac{1}{n}\right)\mu \le \nabla\psi \# dx \le \left(1 + \frac{1}{n}\right)\mu \right\}$$

and put $G := \cap_{n \in \mathbb{N}} \cup_{k \in \mathbb{N}} G_{k,n}$. The claim would follow provided we can prove that $q(G) = 1$.

We first show that for fixed $k, n \in \mathbb{N}$ the set $G_{k,n}$ is closed. Let $(\mu_m, \psi_m)_{m \in \mathbb{N}} \in G_{k,n}$ be a sequence converging to some $(\mu, \psi) \in \Gamma \times \mathcal{K}$. Since $F_k$ is closed, we have $\psi \in F_k$ and we only need to prove that

$$\left(1 - \frac{1}{n}\right)\mu \le \nabla\psi \# dx \le \left(1 + \frac{1}{n}\right)\mu, \tag{4.36}$$

which by weak convergence of $\mu_m$ to $\mu$ and the fact that $(\mu_m, \psi_m)$ satisfies (4.36), would be proven provided we show that $\nabla\psi_m \# dx$ weakly converges up to subsequence to $\nabla\psi \# dx$.

Let $f \in C_c(\mathbb{R}^2)$ be fixed and let us prove that up to subsequence,

$$\int_{\mathbb{R}^2} f(\nabla\psi_m) \to \int_{\mathbb{R}^2} f(\nabla\psi). \tag{4.37}$$

By local uniform convergence of the convex functions $\psi_m$ to $\psi$, if $p_m \in \partial\psi_m(x)$ with $p_m \to p$, then $p \in \partial\psi(x)$. Therefore, $\nabla\psi_m$ converges a.e. to $\nabla\psi$. Let $r > 0$ be such that $\operatorname{Spt} f \subset B_r$. In order to apply the dominated convergence theorem and conclude the proof of (4.37), we need to prove that there exists $R$ depending only on $k$ and $r$ such that if $|x| \ge R$, then $|\nabla\psi_m(x)| \ge r$. This is a simple consequence of the fact that $\psi_m \in F_k$ and the monotonicity of $\nabla\psi_m$. Indeed, since $\psi_m \in F_k$,

$$\frac{1}{4}|x|^2 - k \le \psi_m \le |x|^2 + k$$

so that at every point $x$ of differentiability of $\psi_m$, since

$$\frac{1}{4}|x|^2 - 2k \le \psi_m(x) - \psi_m(0) \le \nabla\psi_m(x) \cdot x \le |\nabla\psi_m(x)||x|,$$

we have

$$|\nabla\psi_m(x)| \ge \frac{1}{4}|x| - \frac{2k}{|x|}.$$

This gives the claim and shows that $G_{k,n}$ is indeed a closed set.

Since $G_{k,n}$ is measurable, $G = \cap_n \cup_k G_{k,n}$ is also measurable. Let $q$ be an accumulation point of $(q_L)_L$ so that up to subsequence $q_L \to q$. For a given $\varepsilon > 0$, let us prove that for every $n$ and for $k$ large enough (depending only on $\varepsilon$) and every $L$ large enough (depending only on $k$, $n$ and $\varepsilon$),

$$q_L(G_{k,n}) \ge 1 - 2\varepsilon. \tag{4.38}$$

Since

$$q_L((G_{k,n})^c) \le q_L(\Gamma \times (F_k)^c) + q_L\left(\left\{ (\mu, \psi) : \left(1 - \frac{1}{n}\right)\mu \le \nabla\psi \# dx \le \left(1 + \frac{1}{n}\right)\mu \right\}^c\right),$$

it is enough to prove that each of the terms on the right-hand side is smaller than $\varepsilon$ for $k$, $n$ and $L$ large enough. The first term is estimated in (4.35) and we just need to consider the second term. For every $L$ and every $\mu \in \Gamma$, we have by (4.29)

$$q_L\left(\left\{ (\mu, \psi) : \left(1 - \frac{1}{n}\right)\mu \le \nabla\psi \# dx \le \left(1 + \frac{1}{n}\right)\mu \right\}^c\right) = \mathbb{P}_L\left[\mu(Q_L) \notin L^2\left[\frac{n}{n+1}, \frac{n}{n-1}\right]\right],$$

which by Cramér-Chernoff's bounds for the Poisson distribution with intensity $L^2$ (see [12]) gives

$$q_L\left(\left\{ (\mu, \psi) : \left(1 - \frac{1}{n}\right)\mu \le \nabla\psi \# dx \le \left(1 + \frac{1}{n}\right)\mu \right\}^c\right) \le \exp\left(-C\frac{L^2}{n^2}\right),$$

concluding the proof of (4.38).

Now for fixed $k, n \in \mathbb{N}$ large enough, since $G_{k,n}$ is closed, we have by (4.38)

$$1 - 2\varepsilon \le \limsup_L q_L(G_{k,n}) \le q(G_{k,n}).$$

Using that for every $k, n \in \mathbb{N}$, $G_{k,n+1} \subset G_{k,n}$ and that $G = \cap_n \cup_k G_{k,n}$, we obtain that for every $\varepsilon > 0$,

$$q(G) \ge 1 - 2\varepsilon,$$

which concludes the proof.

Step 3.

To show (iii) fix an accumulation point $q$ and a subsequence, still denoted by $(q_L)_L$, converging weakly to $q$. Since for fixed $\lambda > 0$, the Legendre transform $\psi \to \psi^*$ is continuous from $F_\lambda$ to $\mathcal{K}$, for every $h \in \mathbb{R}^2$ and $\lambda > 0$ the map $\psi \mapsto D_h^2\psi^*$ is continuous on $F_\lambda$ (recall (4.33)) with values in $C(\mathbb{R}^2)$. Hence, the convergence $q_L \to q$ together with (4.35) readily implies for all $f \in C_b(\Gamma \times C(\mathbb{R}^2))$ that also

$$\int_{\Gamma \times \mathcal{K}} f(\mu, D_h^2\psi^*)\, dq_L \to \int_{\Gamma \times \mathcal{K}} f(\mu, D_h^2\psi^*)\, dq.$$

By (4.30), we have $q_L$ almost surely $D_h^2\psi^*_\mu = D_h^2\psi^*_{\theta_z\mu}(\cdot - z)$. Using the invariance of $\mathbb{P}_L$ under $\theta$ and the definition of $q_L$ we have for fixed $z \in \mathbb{R}^2$

$$\begin{aligned}
\int_{\Gamma \times \mathcal{K}} f(\mu, D_h^2\psi^*)\, dq_L &= \int_\Gamma f(\mu, D_h^2\psi^*_\mu)\, d\mathbb{P}_L \\
&= \int_\Gamma f(\theta_{-z}\theta_z\mu, D_h^2\psi^*_{\theta_z\mu}(\cdot - z))\, d\mathbb{P}_L \\
&= \int_\Gamma f(\theta_{-z}\mu, D_h^2\psi^*_\mu(\cdot - z))\, d\mathbb{P}_L \\
&= \int_{\Gamma \times \mathcal{K}} f(\theta_{-z}\mu, D_h^2\psi^*(\cdot - z))\, dq_L.
\end{aligned}$$

Since for fixed $z \in \mathbb{R}^2$, $\theta_{-z}$ is continuous on $\Gamma$, for every such $z$ and $\lambda > 0$, $(\mu, \psi) \to f(\theta_{-z}\mu, D_h^2\psi^*) \in C_b(\Gamma \times F_\lambda)$ so that by weak convergence $q_L \to q$ combined again with (4.35) we have

$$\int_{\Gamma \times \mathcal{K}} f(\theta_{-z}\mu, D_h^2\psi^*(\cdot - z))\, dq_L \to \int_{\Gamma \times \mathcal{K}} f(\theta_{-z}\mu, D_h^2\psi^*(\cdot - z))\, dq$$

which implies the thesis.

References

[1] M. Ajtai, J. Komlós, and Gábor Tusnády,

On optimal matchings, Combinatorica (1984), 259–264.
[2] G. Alberti, R. Choksi, and F. Otto, Uniform energy distribution for an isoperimetric problem with long-range interactions, J. Amer. Math. Soc. (2009), no. 2, 569–605.
[3] L. Ambrosio, M. Colombo, G. De Philippis, and A. Figalli, Existence of Eulerian solutions to the semigeostrophic equations in physical space: the 2-dimensional periodic case, Comm. Partial Differential Equations (2012), no. 12, 2209–2227.
[4] L. Ambrosio, N. Fusco, and D. Pallara, Functions of bounded variation and free discontinuity problems, Oxford Mathematical Monographs, The Clarendon Press, Oxford University Press, New York, 2000.
[5] L. Ambrosio, N. Gigli, and G. Savaré, Gradient flows in metric spaces and in the space of probability measures, second ed., Lectures in Mathematics ETH Zürich, Birkhäuser Verlag, Basel, 2008.
[6] L. Ambrosio, F. Stra, and D. Trevisan, A PDE approach to a 2-dimensional matching problem, arXiv:1611.04960 (2016).
[7] S. Armstrong, T. Kuusi, and J.-C. Mourrat, Quantitative stochastic homogenization and large-scale regularity, ArXiv e-prints (2017).
[8] S. N. Armstrong and C. K. Smart, Quantitative stochastic homogenization of convex integral functionals, Ann. Sci. Éc. Norm. Supér. (4) (2016), no. 2, 423–481.
[9] M. Avellaneda and F.-H. Lin, Compactness methods in the theory of homogenization, Comm. Pure Appl. Math. (1987), no. 6, 803–847.
[10] F. Barthe and C. Bordenave, Combinatorial optimization over two random point sets, Séminaire de probabilités XLV, Cham: Springer, 2013, pp. 483–535.
[11] S. Bobkov and M. Ledoux, One-dimensional empirical measures, order statistics and Kantorovich transport distances, preprint (2014), to appear in Mem. Am. Math. Soc.
[12] S. Boucheron, G. Lugosi, and P. Massart, Concentration inequalities, Oxford University Press, Oxford, 2013.
[13] S. Caracciolo, C. Lucibello, G. Parisi, and G. Sicuro, Scaling hypothesis for the euclidean bipartite matching problem, Physical Review E (2014), no. 1.
[14] D. Cordero-Erausquin, Sur le transport de mesures périodiques, C. R. Acad. Sci. Paris Sér. I Math. (1999), no. 3, 199–202.
[15] D. Cordero-Erausquin, R. J. McCann, and M. Schmuckenschläger, A Riemannian interpolation inequality à la Borell, Brascamp and Lieb, Invent. Math. (2001), no. 2, 219–257.
[16] G. De Philippis and A. Figalli, Partial regularity results in optimal transportation, Trends in Contemporary Mathematics (Cham) (Vincenzo Ancona and Elisabetta Strickland, eds.), Springer International Publishing, 2014, pp. 293–307.
[17] G. De Philippis and A. Figalli, Partial regularity for optimal transport maps, Publ. Math. Inst. Hautes Études Sci. (2015), 81–112.
[18] S. Dereich, M. Scheutzow, and R. Schottstedt, Constructive quantization: approximation by empirical measures, Ann. Inst. Henri Poincaré Probab. Stat. (2013), no. 4, 1183–1203.
[19] V. Dobrić and J.E. Yukich, Asymptotics for transportation cost in high dimensions, J. Theor. Probab. (1995), no. 1, 97–118.
[20] A. Figalli and Y.-H. Kim, Partial regularity of Brenier solutions of the Monge-Ampère equation, Discrete Contin. Dyn. Syst. (2010), no. 2, 559–565.
[21] N. Fournier and A. Guillin, On the rate of convergence in Wasserstein distance of the empirical measure, Probab. Theory Related Fields (2015), no. 3-4, 707–738.
[22] A. Gloria, S. Neukamm, and F. Otto, A regularity theory for random elliptic operators, ArXiv e-prints (2014).
[23] M. Goldman and F. Otto, A variational proof of partial regularity for optimal transportation maps, arXiv:1704.05339 (2017).
[24] M. Goldman and F. Otto, An ε-regularity result for optimal transportation maps between continuous densities, preprint (2018).
[25] M. Huesmann and K.-T. Sturm, Optimal transport from Lebesgue to Poisson, Ann. Probab. (2013), no. 4, 2426–2478.
[26] M. Ledoux, On optimal matching of Gaussian samples, Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI) (2017), no. Veroyatnost' i Statistika. 25, 226–264.
[27] M. Ledoux, On optimal matching of Gaussian samples II, preprint (2018).
[28] F. Santambrogio, Optimal transport for applied mathematicians, Progress in Nonlinear Differential Equations and their Applications, vol. 87, Birkhäuser/Springer, Cham, 2015, Calculus of variations, PDEs, and modeling.
[29] M. Talagrand, The transportation cost from the uniform measure to the empirical measure in dimension ≥ 3, Ann. Probab. (1994), no. 2, 919–959.
[30] M. Talagrand, Upper and lower bounds for stochastic processes: modern methods and classical problems, vol. 60, Springer Science & Business Media, 2014.
[31] C. Villani, Topics in optimal transportation, Graduate Studies in Mathematics, vol. 58, American Mathematical Society, Providence, RI, 2003.
[32] C. Villani, Optimal transport, Grundlehren der Mathematischen Wissenschaften, vol. 338, Springer-Verlag, Berlin, 2009, Old and new.