A Damped Newton Algorithm for Generated Jacobian Equations
ANATOLE GALLOUËT, QUENTIN MÉRIGOT, AND BORIS THIBERT
Abstract.
Generated Jacobian equations have been introduced by Trudinger [Disc. Cont. Dyn. Syst. (2014), pp. 1663–1681] as a generalization of Monge-Ampère equations arising in optimal transport. In this paper, we introduce and study a damped Newton algorithm for solving these equations in the semi-discrete setting, meaning that one of the two measures involved in the problem is finitely supported and the other one is absolutely continuous. We also present a numerical application of this algorithm to the near-field parallel refractor problem arising in non-imaging optics.

1. Introduction
This paper is concerned with the numerical resolution of
Generated Jacobian equations, introduced by N. Trudinger [19] as a generalization of Monge-Ampère equations arising in optimal transport. Generated Jacobian equations were originally motivated by inverse problems arising in non-imaging optics in the near-field case [12, 7, 8], but they also apply to problems arising in economics [16, 5]. A survey on these equations and their applications was recently written by N. Guillen [6]. The input for a generated Jacobian equation consists of two probability measures µ and ν over two spaces X and Y, and a generating function G : X × Y × R → R. Loosely speaking, a scalar function ψ on Y is an Alexandrov solution to the generated Jacobian equation if the map T_ψ defined by

T_ψ(x) ∈ arg max_{y∈Y} G(x, y, ψ(y))

transports µ onto ν, i.e. ν is the image of the measure µ under T_ψ, denoted

(T_ψ)#µ = ν.

Note that one needs to impose some conditions on µ and G ensuring that the map T_ψ is well-defined µ-almost everywhere. One can describe the meaning of this equation using an economic metaphor. We consider X as a set of customers, Y as a set of products, and G(x, y, ψ(y)) corresponds to the utility of the product y for the customer x given a price ψ(y). The probability measures µ and ν describe the distributions of customers and products. The map T_ψ can be described as the "best response" of customers given a price menu ψ : Y → R: each customer x ∈ X tries to maximize its own utility G(x, y, ψ(y)) over all products y ∈ Y; the maximizer, if it exists and is unique, is denoted T_ψ(x). Then, ψ is a solution to the generated Jacobian equation if the best response map T_ψ pushes the distribution of customers to the distribution of available products ν.

In this article, we are interested in algorithms for solving the semi-discrete case, where the source measure µ is absolutely continuous with respect to the Lebesgue measure on X ⊆ R^d and the target measure ν is finitely supported. Such discretizations can be traced back to Minkowski, but have been used more recently to solve Monge-Ampère equations [17], problems from non-imaging optics [4], more general optimal transport problems [10], but also generated Jacobian equations [1]. In all the cited papers, the methods are coordinate-wise algorithms with minimal increment and are similar to the algorithm introduced by Oliker-Prussner [17]. The number of iterations of these algorithms scales more than cubically in N, where N is the size of the support of ν, making them limited to fairly small discretizations. More recently, Newton methods have been introduced to solve semi-discrete optimal transport problems [11, 14]. In this paper, we show that Newtonian techniques can also be applied to Generated Jacobian equations under mild conditions on the generating function G.

Date: January 21, 2021.

Semi-discrete optimal transport.
The semi-discrete setting refers to the case where one is given an absolutely continuous probability measure µ (with respect to the Lebesgue measure) supported on a domain X of R^d, and a discrete probability measure ν = ∑_y ν_y δ_y supported on a finite set Y. Given a cost function c : X × Y → R, the optimal transport problem amounts to finding a function T : X → Y that minimizes the total cost ∫_X c(x, T(x)) dµ(x) under the condition µ(T^{−1}(y)) = ν_y for any y ∈ Y. This problem can be recast, using Kantorovitch duality under some mild conditions on the cost c, into finding a dual potential ψ : Y → R that satisfies

∀y ∈ Y, µ(Lag_y(ψ)) = ν_y, (MA)

where Lag_y(ψ) are the Laguerre cells defined by

Lag_y(ψ) = {x ∈ X | ∀z ∈ Y, c(x, y) + ψ(y) ≤ c(x, z) + ψ(z)}.

The application T_ψ defined for x ∈ X by T_ψ(x) = y if x ∈ Lag_y(ψ) is then an optimal transport map between µ and ν for the cost c, and satisfies in particular (T_ψ)#µ = ν. Equation (MA) can be regarded as a discrete version of the Monge-Ampère type equation arising in optimal transport. We refer for instance to [2, §2] for more details in the case c(x, y) = −⟨x | y⟩.

Generated Jacobian equation.
The Generated Jacobian equation in the semi-discrete setting has a very similar form. The problem also amounts to finding a function ψ : Y → R that satisfies Equation (MA), but the Laguerre cells have a more general form and read

Lag_y(ψ) = {x ∈ X | ∀z ∈ Y, G(x, y, ψ(y)) ≥ G(x, z, ψ(z))},

where G is called a generating function. When G is linear in the last variable, i.e. when G(x, y, v) = −c(x, y) − v, one obviously recovers the Laguerre cells from optimal transport.

Note that the lack of linearity in the generating function G adds several theoretical and practical difficulties. To see this, consider the mass function

H : R^Y → R^Y, ψ ↦ (µ(Lag_y(ψ)))_{y∈Y}.
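For intuition, the mass function H(ψ) can be approximated numerically by sampling µ and assigning each sample to the generalized Laguerre cell whose generating value is largest. The sketch below is purely illustrative and not the discretization used in the paper: it assumes a uniform µ on [0, 1]² and an optimal-transport-type generating function G(x, y, v) = −‖x − y‖² − v, both chosen only for this example.

```python
import numpy as np

def mass_function(G, points, psi, n_samples=100_000, d=2, seed=0):
    """Monte Carlo estimate of H(psi) = (mu(Lag_y(psi)))_y, mu uniform on [0,1]^d.

    G(x, y, v) must accept a batch x of shape (n, d) and return shape (n,)."""
    rng = np.random.default_rng(seed)
    x = rng.random((n_samples, d))                     # samples of mu
    # utility matrix U[k, i] = G(x_k, y_i, psi_i); best response = argmax over i
    U = np.stack([G(x, y, v) for y, v in zip(points, psi)], axis=1)
    cells = np.argmax(U, axis=1)
    return np.bincount(cells, minlength=len(points)) / n_samples

# toy generating function (optimal-transport case): G(x, y, v) = -|x - y|^2 - v
G = lambda x, y, v: -np.sum((x - y) ** 2, axis=-1) - v
Y = np.array([[0.25, 0.25], [0.75, 0.75]])
H = mass_function(G, Y, np.zeros(2))
print(H)   # roughly [0.5, 0.5]: with equal potentials the two cells split evenly
```

Raising ψ_i shrinks the i-th cell (this is the monotonicity in v required of a generating function), which is precisely the mechanism a Newton method on ψ ↦ H(ψ) exploits.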
In the optimal transport case, the function H is invariant under the addition of a constant (i.e. H(ψ + c) = H(ψ) for any c ∈ R), which entails, under mild assumptions, that the kernel of DH(ψ) is one-dimensional and coincides with the vector space of constant functions on Y [11]. Furthermore, as a consequence of Kantorovitch duality, the function H is the gradient of a functional, called Kantorovitch functional in [11]. This implies that the differential DH(ψ) is symmetric. In the case of generated Jacobian equations, these two properties do not hold anymore: the differential DH(ψ) is not necessarily symmetric, and its kernel is not reduced to the set of constant functions in general.

In this article, we generalize the damped Newton algorithm proposed in [11] to solve generated Jacobian equations. Note that unlike [11] we do not require any Ma-Trudinger-Wang type condition to prove the convergence of our algorithm. In Section 2 we recall the notion of generating function and its properties, and introduce the generated Jacobian equation in the semi-discrete setting. Section 3 is entirely dedicated to the numerical resolution of the generated Jacobian equation. In Section 4, we apply our algorithm to numerically solve the Near Field Parallel Reflector problem. Note that F. Abedin and C. Gutierrez also consider this problem [1], but their algorithm requires a strong condition, called Visibility Condition, that implies the
Twist condition (defined hereafter) of the generating function G. We show that under a much weaker assumption, this twist condition holds for a subset of dual potentials ψ : Y → R on which we can apply our algorithm. It is very likely that our assumption could also be adapted to [1].

2. Semi-discrete generated Jacobian equation
In this section, we recall the notions introduced by N. Trudinger in order to define the generated Jacobian equation [19] in the semi-discrete setting. Let Ω be an open bounded domain of R^d, let X be a compact subset of Ω, and let Y be a finite set of R^d. Let µ be a measure on Ω, absolutely continuous with respect to the Lebesgue measure, with non-negative density ρ supported on X (i.e. spt(ρ) ⊂ X), and let ν = ∑_{y∈Y} ν_y δ_y be a measure on the finite set Y such that all ν_y are positive (ν_y > 0). These two measures must satisfy the mass balance condition µ(X) = ν(Y), and it is not restrictive to view them as probability measures:

∫_X ρ(x) dx = ∑_{y∈Y} ν_y = 1.

Notations.
We denote by H^k the k-dimensional Hausdorff measure in R^d. In particular H^d is the Lebesgue measure in R^d. The set of functions from Y to R is denoted by R^Y. We denote by ⟨·|·⟩ the Euclidean scalar product, by ‖·‖ the Euclidean norm, by B(x, r) the Euclidean ball of center x and radius r, and by χ_A : R^d → {0, 1} the indicator function of a set A. The image and kernel of a matrix M are respectively denoted by im(M) and ker(M). We denote by span(u) the linear space spanned by a vector u, by ∇_x G the gradient of a function G with respect to x, and by ∂_v G its scalar derivative with respect to v. Finally, for N ∈ N, we denote ⟦1, N⟧ = {1, . . . , N}.
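The standing assumptions on µ and ν are easy to mirror numerically. The following toy sketch (all values arbitrary, chosen only for illustration) builds a grid-quadrature approximation of a density ρ on X = [0, 1]² and a discrete measure ν, both normalized so that the mass balance ∫_X ρ(x) dx = ∑_y ν_y = 1 holds.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256                                     # grid resolution for X = [0, 1]^2
xs = (np.arange(n) + 0.5) / n               # midpoint quadrature nodes
xx, yy = np.meshgrid(xs, xs)

# arbitrary positive density, normalized so the quadrature integral equals 1
rho = np.exp(-8 * ((xx - 0.5) ** 2 + (yy - 0.5) ** 2))
rho /= rho.sum() / n**2                     # now sum(rho) / n^2 == 1

# discrete target measure nu = sum_y nu_y delta_y with positive weights nu_y
weights = rng.random(10) + 0.1              # ensures nu_y > 0
nu = weights / weights.sum()                # mass balance: sum(nu) == 1

print(rho.sum() / n**2, nu.sum())           # both equal 1 up to rounding
```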
Generating function.
We recall below the notion of generating function and G-convexity in the semi-discrete setting [19, 1].

Definition 1 (Generating function). Let a, b ∈ R ∪ {−∞, +∞} with a < b and I = ]a, b[. A function G : Ω × Y × I → R is called a generating function. We assume that it satisfies the following properties:
• Regularity condition: (x, y, v) ↦ G(x, y, v) is continuously differentiable in x and v, and
∀α < β ∈ I, sup_{(x,y,v) ∈ Ω×Y×[α,β]} |∇_x G(x, y, v)| < +∞. (Reg)
• Monotonicity condition:
∀(x, y, v) ∈ Ω × Y × I : ∂_v G(x, y, v) < 0. (Mono)
• Twist condition:
∀x ∈ Ω, (y, v) ↦ (G(x, y, v), ∇_x G(x, y, v)) is injective on Y × I. (Twist)
• Uniform Convergence condition:
∀y ∈ Y, lim_{v→a} inf_{x∈Ω} G(x, y, v) = +∞. (UC)

Remark 2 (Reparametrization of G). Through the whole paper we can and will consider that I = R. Indeed, suppose that G : Ω × Y × I → R satisfies the assumptions of the above definition. Considering a strictly increasing C¹ diffeomorphism ζ : R → I and setting G̃(x, y, v) = G(x, y, ζ(v)), we get a generating function G̃ : Ω × Y × R → R, which also satisfies the conditions above. Moreover, up to reparametrization, the generated Jacobian equations associated to G and G̃ are equivalent.

Remark 3. Note that F. Abedin and C. Gutierrez [1] impose a slightly more restrictive inequality in condition (Reg): their supremum is taken over Ω × Y × ]−∞, α] for any α instead of Ω × Y × [α, β]. We changed this condition here in order to handle the Near Field point reflector problem in the last section.

Definition 4 (G-convexity). Let ϕ : Ω → R be a function. If ϕ(x) ≥ G(x, y₀, λ₀) for all x ∈ Ω with equality at x = x₀, we say that the function G(·, y₀, λ₀) supports ϕ at x₀. A function ϕ : Ω → R is said to be G-convex if it is supported at every point, i.e. for all x₀ ∈ Ω, there exists (y₀, λ₀) ∈ Y × R s.t.

∀x ∈ Ω, ϕ(x) ≥ G(x, y₀, λ₀) and ϕ(x₀) = G(x₀, y₀, λ₀). (2.1)

Remark 5.
The notion of G-convexity generalizes in a certain sense the notion of convexity. Intuitively, it amounts to replacing the supporting hyperplanes by functions of the form G(·, y, λ). If G(·, y, λ) is convex for any y ∈ Y and any λ ∈ R, then a G-convex function is always convex. Moreover, if the generating function G is affine (i.e. G(x, y, λ) = ⟨x, y⟩ + λ) and if Y = R^d, then the notions of G-convexity and convexity are equivalent.
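To make the remark above concrete, here is a small numerical illustration (the pairs (y_k, λ_k) are arbitrary, chosen only for this sketch): with the affine generating function G(x, y, λ) = ⟨x, y⟩ + λ in dimension one, a G-convex function is a finite maximum of affine functions, and its convexity can be checked via the midpoint inequality on a grid.

```python
import numpy as np

# affine generating function from the remark (dimension d = 1): G(x, y, lam) = x*y + lam
def G(x, y, lam):
    return x * y + lam

# a G-convex function: pointwise maximum of supporting functions G(., y_k, lam_k)
supports = [(-1.0, 0.0), (0.0, -0.2), (1.0, 0.0)]   # arbitrary (y_k, lam_k) pairs
def phi(x):
    return max(G(x, y, lam) for y, lam in supports)

# numerical convexity check: midpoint inequality over a grid of pairs
xs = np.linspace(-2.0, 2.0, 81)
ok = all(phi(0.5 * (a + b)) <= 0.5 * (phi(a) + phi(b)) + 1e-12
         for a in xs for b in xs)
print(ok)   # True: a maximum of affine functions is convex
```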
Definition 6 (G-subdifferential). Let ϕ be a G-convex function and let x₀ ∈ Ω. The G-subdifferential ∂_G ϕ of ϕ at x₀ is defined by

∂_G ϕ(x₀) = {y ∈ Y | ∃λ ∈ R s.t. G(·, y, λ) supports ϕ at x₀}. (2.2)

The following lemma (Lemma 2.1 in [1]) shows that the G-subdifferential ∂_G ϕ is single-valued almost everywhere, and induces a measurable map.

Lemma 7 ([1, Lemma 2.1]).
Let ϕ be G-convex with G satisfying (Reg), (Mono) and (Twist). Then, there exists a measurable map S_ϕ : Ω → Y s.t. for a.e. x ∈ Ω, ∂_G ϕ(x) = {S_ϕ(x)}.

We can now define the notion of generated Jacobian equation.
Definition 8 (Brenier solution to the GJE). A function ϕ : X → R is a Brenier solution to the generated Jacobian equation between a probability density µ on Ω and a probability measure ν = ∑_{y∈Y} ν_y δ_y on Y if it satisfies

ϕ is G-convex and ∀y ∈ Y, µ(S_ϕ^{−1}({y})) = ν_y. (GenJac)

2.2. G-transform.
The goal in this section is to write a dual formulation of the generated Jacobian equation, using the notion of G-transform introduced by Trudinger [19].

Definition 9.
The G-transform ψ^G : Ω → R of ψ : Y → R is defined by

∀x ∈ Ω, ψ^G(x) = max_{y∈Y} G(x, y, ψ(y)). (2.3)

Proposition 10.
Assume G satisfies (Reg), (Mono) and (Twist), and let ϕ : Ω → R be a G-convex function. Then there exists ψ : S_ϕ(Ω) → R s.t.

∀x ∈ Ω, ϕ(x) = max_{y ∈ S_ϕ(Ω)} G(x, y, ψ(y)).

Proof.
Let y ∈ S_ϕ(Ω); then for any x₀ ∈ S_ϕ^{−1}(y) there exists λ₀ ∈ R such that ϕ(x₀) = G(x₀, y, λ₀). Since ϕ is G-convex, we also have for any x ∈ Ω that ϕ(x) ≥ G(x, y, λ₀). Specifically, for x₁ ∈ S_ϕ^{−1}(y), we get ϕ(x₁) = G(x₁, y, λ₁) ≥ G(x₁, y, λ₀), and since ∂_v G(x, y, v) < 0, then λ₁ ≤ λ₀. By symmetry we have λ₀ = λ₁. We deduce that there exists a unique ψ(y) ∈ R such that for any x ∈ S_ϕ^{−1}(y), ϕ(x) = G(x, y, ψ(y)). This defines a map ψ : S_ϕ(Ω) → R satisfying

∀x ∈ Ω: ∀y ∈ S_ϕ(Ω), ϕ(x) ≥ G(x, y, ψ(y)), and ∃y ∈ S_ϕ(Ω), ϕ(x) = G(x, y, ψ(y)).

As a conclusion we have ϕ(x) = max_{y ∈ S_ϕ(Ω)} G(x, y, ψ(y)). □

Corollary 11.
Let ϕ be a G-convex function such that S_ϕ(Ω) = Y. Then there exists ψ : Y → R such that ϕ = ψ^G.

Remark 12 (G-convex functions are not always G-transforms). Without any additional assumptions on the generating function, we cannot guarantee that
any G-convex function ϕ on X is the G-transform of a function ψ on Y. Define for instance Ω = (1, 2), Y = {0, 1},

G(x, y, v) = xe^{−v} if y = 0, −xv if y = 1,

and consider the function ϕ on Ω defined by ϕ(x) = G(x, 1, 1) = −x, which is G-convex by definition. Yet for any v ∈ R and any x ∈ Ω,

max(G(x, 0, v), G(x, 1, 1)) = max(xe^{−v}, −x) = xe^{−v} > ϕ(x),

thus implying that there does not exist any ψ : Y → R such that ϕ is the G-transform of ψ.

Suppose that ϕ is a solution of (GenJac) and that for all y ∈ Y, the mass ν_y is positive. Then, for any y ∈ Y one has µ(S_ϕ^{−1}(y)) = ν_y > 0, which guarantees that S_ϕ(Ω) = Y. Therefore, by Corollary 11, there exists a function ψ on Y such that ϕ = ψ^G. This means that we can reparametrize the problem (GenJac) by assuming that the solution ϕ is the G-transform of some function ψ. The sets S_{ψ^G}^{−1}({y}), which appear in (GenJac), will be called generalized Laguerre cells.

Definition 13 (Generalized Laguerre cells). The generalized Laguerre cells associated to a function ψ : Y → R are defined for every y ∈ Y by

Lag_y(ψ) := S_{ψ^G}^{−1}({y}) = {x ∈ Ω | ∀z ∈ Y, G(x, y, ψ(y)) ≥ G(x, z, ψ(z))}. (2.4)

Note that by Lemma 7, the intersection of two generalized Laguerre cells has zero Lebesgue measure, ensuring that the sets Lag_y(ψ) form a partition of Ω up to a µ-negligible set.

Definition 14 (Alexandrov solution to GJE). A function ψ : Y → R is an Alexandrov solution to the generated Jacobian equation between a probability density µ on Ω and a probability measure ν = ∑_{y∈Y} ν_y δ_y on Y if ψ^G is a Brenier solution (Definition 8) to the same GJE, or equivalently if

∀y ∈ Y, H_y(ψ) = ν_y, where H_y(ψ) = µ(Lag_y(ψ)).

Setting H(ψ) = (H_y(ψ))_{y∈Y} and considering ν as a function over Y, we can even rewrite this equation as

H(ψ) = ν. (GenJacD)

3. Resolution of the generated Jacobian equation
The goal of this section is to introduce and study a Newton algorithm to solve the semi-discrete generated Jacobian equation (GenJacD). Before doing so, we study the regularity of the mass function H : R^Y → R^Y in Section 3.1 and establish a non-degeneracy property of its differential DH in Section 3.2, under a connectedness assumption on the support of the source measure. We present the algorithm and prove its convergence in Section 3.3. For simplicity, we will number the points in Y, i.e. we assume that

Y = {y₁, . . . , y_N},

where the points y_i are distinct. This allows us to identify the set of functions R^Y with R^N, by setting ψ_i = ψ(y_i). We also denote by (e_i)_{1≤i≤N} the canonical basis of R^N. Finally, we introduce a shortened notation for Laguerre cells and intersections thereof:

Lag_i(ψ) = Lag_{y_i}(ψ), Lag_{ij}(ψ) = Lag_i(ψ) ∩ Lag_j(ψ).

Throughout this section, we assume that the generating function G satisfies all the conditions of Definition 1.

3.1. C¹-regularity of H. The differentiability of H is established under a (mild) genericity hypothesis on the cost function, ensuring in particular that the intersection between three distinct Laguerre cells is negligible with respect to the (d−1)-dimensional Hausdorff measure, denoted H^{d−1}. To write this hypothesis, we denote, for three distinct indices i, j, k in ⟦1, N⟧,

Γ_ij(ψ) = {x ∈ Ω | G(x, y_i, ψ_i) = G(x, y_j, ψ_j)}, Γ_ijk(ψ) = Γ_ij(ψ) ∩ Γ_ik(ψ).

Definition 15 (Genericity of the generating function). The generating function G is generic with respect to Ω and Y if for any distinct indices i, j, k in ⟦1, N⟧ and any ψ ∈ R^N we have

H^{d−1}(Γ_ijk(ψ)) = 0. (Gen_Ω^Y)

The generating function G is generic with respect to the boundary ∂X and Y if for any distinct indices i, j in ⟦1, N⟧ and any ψ ∈ R^N we have

H^{d−1}(Γ_ij(ψ) ∩ ∂X) = 0. (Gen_∂X^Y)

Proposition 16. Assume that
• G ∈ C²(Ω × Y × R) satisfies (Reg), (Mono), (Twist), (Gen_Ω^Y), (Gen_∂X^Y),
• X ⊆ Ω is compact and ρ is a continuous probability density on X.
Then the mass function H : R^N → R^N defined by H(ψ) = (µ(Lag_i(ψ)))_{1≤i≤N} is of class C¹. We have, for ψ ∈ R^N and i, j ∈ ⟦1, N⟧,

∂H_j/∂ψ_i (ψ) = ∫_{Lag_ij(ψ)} ρ(x) |∂_v G(x, y_i, ψ_i)| / ‖∇_x G(x, y_j, ψ_j) − ∇_x G(x, y_i, ψ_i)‖ dH^{d−1}(x) ≥ 0 for j ≠ i,

∂H_i/∂ψ_i (ψ) = − ∑_{j≠i} ∂H_j/∂ψ_i (ψ). (3.5)

Proof.
Let ψ ∈ R^N and i, j ∈ ⟦1, N⟧ be fixed indices such that i ≠ j. We want to compute ∂H_j/∂ψ_i (ψ). For this purpose, we introduce ψ_t = ψ + t e_i for t ≥ 0. From (Mono), we obviously have Lag_j(ψ) ⊆ Lag_j(ψ_t). Therefore

H_j(ψ_t) − H_j(ψ) = µ(Lag_j(ψ_t)) − µ(Lag_j(ψ)) = µ(Lag_j(ψ_t) \ Lag_j(ψ)).

We introduce the set L obtained by removing one inequality in the definition of the generalized Laguerre cell Lag_j(ψ):

L = {x ∈ Ω | ∀k ≠ i, G(x, y_j, ψ_j) ≥ G(x, y_k, ψ_k)}.

We have in particular
Lag_j(ψ) ⊆ L, and more precisely

Lag_j(ψ_t) \ Lag_j(ψ) = ⨆_{0 < s ≤ t} (L ∩ Γ_ij(ψ_s)).
We will use this formula to get another expression of H_j(ψ_t) − H_j(ψ).

Step 1. Construction of u_ij such that Γ_ij(ψ_t) = u_ij^{−1}({t}). To construct such a function u_ij : Ω → R, we first consider the function f_ij : Ω × R → R defined by

f_ij(x, t) = G(x, y_j, ψ_j) − G(x, y_i, ψ_i + t).

This function f_ij is of class C¹ on Ω × R by hypothesis on G, and we have

∀(x, t) ∈ Ω × R, ∂f_ij/∂t (x, t) = −∂_v G(x, y_i, ψ_i + t) > 0.

This implies that, for a fixed x ∈ Ω, the function f_ij(x, ·) is strictly increasing, so that the equation f_ij(x, t) = 0 has at most one solution. Denoting

V_ij = {x ∈ Ω | ∃t ∈ R, f_ij(x, t) = 0} = ⋃_{t∈R} Γ_ij(ψ_t),

one can therefore define a function u_ij : V_ij → R which satisfies

∀x ∈ V_ij, f_ij(x, t) = 0 ⟺ u_ij(x) = t.

By the implicit function theorem, the set V_ij is open and the function u_ij is C¹ on V_ij. In order to apply the co-area formula, we need to compute the gradient of u_ij. For any point x in V_ij, we have by definition

f_ij(x, u_ij(x)) = G(x, y_j, ψ_j) − G(x, y_i, ψ_i + u_ij(x)) = 0.

Differentiating this expression with respect to x, we obtain

∇u_ij(x) = (∇_x G(x, y_j, ψ_j) − ∇_x G(x, y_i, ψ_i + u_ij(x))) / ∂_v G(x, y_i, ψ_i + u_ij(x)),

which is well defined since ∂_v G(x, y_i, v) < 0 on Ω × Y × R by the (Mono) hypothesis. The (Twist) condition guarantees that for all x ∈ V_ij, the map (y, v) ↦ (G(x, y, v), ∇_x G(x, y, v)) is injective. By definition of u_ij we have f_ij(x, u_ij(x)) = 0, so that G(x, y_j, ψ_j) = G(x, y_i, ψ_i + u_ij(x)). The (Twist) condition then entails ∇_x G(x, y_j, ψ_j) ≠ ∇_x G(x, y_i, ψ_i + u_ij(x)), implying that the gradient ∇u_ij(x) does not vanish.

Step 2. Computation of the partial derivatives.
We can write the difference between Laguerre cells using the function u_ij:

Lag_j(ψ_t) \ Lag_j(ψ) = ⋃_{0 < s ≤ t} (L ∩ Γ_ij(ψ_s)) = L ∩ u_ij^{−1}(]0, t]).
Then the co-area formula gives us

(H_j(ψ_t) − H_j(ψ)) / t = (1/t) ∫_{L ∩ u_ij^{−1}(]0,t])} ρ(x) dx = (1/t) ∫_0^t H_ij(ψ_s) ds,

where we introduced

H_ij(ψ) = ∫_{Lag_ij(ψ)} ρ(x) / ‖∇u_ij(x)‖ dH^{d−1}(x). (3.6)

Note that thanks to the computations above, we already know that the gradient ∇u_ij(x) does not vanish. Moreover, for any x in Lag_ij(ψ) ⊆ Γ_ij(ψ), one has u_ij(x) = 0. Thus,

∇u_ij(x) = (∇_x G(x, y_j, ψ_j) − ∇_x G(x, y_i, ψ_i)) / ∂_v G(x, y_i, ψ_i).

We can therefore rewrite

H_ij(ψ) = ∫_{Lag_ij(ψ)} ρ(x) |∂_v G(x, y_i, ψ_i)| / ‖∇_x G(x, y_j, ψ_j) − ∇_x G(x, y_i, ψ_i)‖ dH^{d−1}(x). (3.7)

As shown in Proposition 17 below, H_ij is continuous on R^N. We deduce that

∂H_j/∂ψ_i (ψ) = lim_{t→0, t>0} (H_j(ψ_t) − H_j(ψ)) / t = H_ij(ψ) ≥ 0. (3.8)

The case t < 0 can be treated similarly by replacing Lag_j(ψ) ⊆ Lag_j(ψ_t) with Lag_j(ψ_t) ⊆ Lag_j(ψ). We thus get the desired expression for the partial derivative ∂H_j/∂ψ_i for i ≠ j. To compute the partial derivative for j = i, we use the mass conservation property ∑_{1≤i≤N} H_i(ψ) = 1 to deduce that

∂H_i/∂ψ_i (ψ) = − ∑_{j≠i} ∂H_j/∂ψ_i (ψ). □

It remains to show that the functions H_ij used in the proof of Proposition 16 are continuous.

Proposition 17.
Under the assumptions of Proposition 16, for every i, j ∈ ⟦1, N⟧, the function H_ij defined in (3.6) is continuous on R^N.

Proof. We introduce the function g : Ω × R^N → R defined by

g(x, ψ) = ρ̄(x) |∂_v G(x, y_i, ψ_i)| / ‖∇_x G(x, y_j, ψ_j) − ∇_x G(x, y_i, ψ_i)‖,

where ρ̄ is a continuous extension of the probability density ρ|_X to Ω. For a given ψ ∈ R^N, the (Twist) hypothesis guarantees that for any x ∈ Γ_ij(ψ), ∇_x G(x, y_j, ψ_j) ≠ ∇_x G(x, y_i, ψ_i). This implies that g is continuous on a neighborhood of the set {(x, ψ) ∈ Ω × R^N | x ∈ Γ_ij(ψ)}. We introduced in Proposition 16 the function

H_ij(ψ) = ∫_{Lag_ij(ψ) ∩ X} g(x, ψ) dH^{d−1}(x).

Let ψ^∞ ∈ R^N and (ψ^n) be a sequence converging towards ψ^∞. The main difficulty for proving that H_ij(ψ^n) converges to H_ij(ψ^∞) as n → +∞ is that the integrals in the definitions of H_ij(ψ^n) and H_ij(ψ^∞) are over different hypersurfaces, namely Γ_ij(ψ^n) and Γ_ij(ψ^∞). Our first step will therefore be to construct a diffeomorphism between (subsets of) these hypersurfaces. We introduce the function f : R × R × Ω → R defined by

f(a, b, x) = G(x, y_j, ψ_j^∞ + a) − G(x, y_i, ψ_i^∞ + b).

We put a_n = ψ_j^n − ψ_j^∞ and b_n = ψ_i^n − ψ_i^∞, so that a_n → 0 and b_n → 0 as n tends to +∞. We also have

Γ_ij(ψ^∞) = (f(0, 0, ·))^{−1}(0), Γ_ij(ψ^n) = (f(a_n, b_n, ·))^{−1}(0).

Step 1: Construction of a map F_n between Γ_ij(ψ^∞) and Γ_ij(ψ^n). This map is constructed using the composition of the flows associated to two vector fields X_a and X_b. Let Ω̃ ⊂ Ω be an open domain containing X. The (Twist) hypothesis guarantees that there exists a neighborhood Ṽ of the set {(a, b, x) ∈ R² × Ω̃ | f(a, b, x) = 0} such that for any v ∈ Ṽ, ∇_x f(v) ≠ 0.
We can then define two vector fields X_a, X_b on Ṽ by

X_a(a, b, x) = (1, 0, −∂_a f(a, b, x) ∇_x f(a, b, x) / ‖∇_x f(a, b, x)‖²),
X_b(a, b, x) = (0, 1, −∂_b f(a, b, x) ∇_x f(a, b, x) / ‖∇_x f(a, b, x)‖²).

Since f is of class C², X_a and X_b are both of class C¹ on Ṽ. We then consider Φ_a and Φ_b the flows associated respectively to X_a and X_b, defined for (t, v) ∈ [−ε, ε] × Ṽ by

Φ_a(0, v) = v, ∂_t Φ_a(t, v) = X_a(Φ_a(t, v)) and Φ_b(0, v) = v, ∂_t Φ_b(t, v) = X_b(Φ_b(t, v)).

The vector fields X_a and X_b are continuously differentiable on Ṽ, which implies that both Φ_a(t, ·) and Φ_b(t, ·) converge pointwise in the C¹ sense toward the identity as t → 0. Let (t, v) ∈ [−ε, ε] × Ṽ. Denoting ∇f(v) = (∂_a f, ∂_b f, ∇_x f)(v), we then have

f(Φ_a(t, v)) = f(Φ_a(0, v)) + ∫_0^t ∂/∂s [f(Φ_a(s, v))] ds
 = f(v) + ∫_0^t ⟨∇f(Φ_a(s, v)) | ∂_t Φ_a(s, v)⟩ ds
 = f(v) + ∫_0^t ⟨∇f(Φ_a(s, v)) | X_a(Φ_a(s, v))⟩ ds = f(v).

Similarly one has f(Φ_b(t, v)) = f(v). Let Π : Ṽ → Ω be the projection of Ṽ ⊆ R² × Ω on Ω, and let F_n : Γ_ij(ψ^∞) ∩ Ω̃ → Ω be the function defined by

F_n(x) = Π(Φ_a(a_n, Φ_b(b_n, (0, 0, x)))).
For x ∈ Γ_ij(ψ^∞) and v = (0, 0, x) ∈ Ṽ, we have Φ_a(a_n, Φ_b(b_n, v)) = (a_n, b_n, F_n(x)), and from the previous equality we deduce that

f(Φ_a(a_n, Φ_b(b_n, v))) = f(v) = 0.

This means that for x ∈ Γ_ij(ψ^∞), F_n(x) ∈ Γ_ij(ψ^n). Moreover, Φ_a(a_n, ·) and Φ_b(b_n, ·) are both invertible, with inverses Φ_a(−a_n, ·) and Φ_b(−b_n, ·). Thus F_n is also invertible, with inverse

F_n^{−1}(x) = Π(Φ_b(−b_n, Φ_a(−a_n, (a_n, b_n, x)))).

Since both Φ_a(a_n, ·) and Φ_b(b_n, ·) converge pointwise in the C¹ sense toward the identity as n → +∞, we have for x ∈ Γ_ij(ψ^∞) ∩ Ω̃

lim_{n→+∞} F_n(x) = x, lim_{n→+∞} JF_n(x) = 1,

where JF_n is the absolute value of the determinant of the Jacobian matrix of F_n.

Step 2: Convergence of H_ij(ψ^n) toward H_ij(ψ^∞). We let L^∞ = Lag_ij(ψ^∞) and L^n = F_n^{−1}(Lag_ij(ψ^n) ∩ Ω̃). Denoting by χ_A the indicator function of a set A, we have

H_ij(ψ^∞) = ∫_{Γ_ij(ψ^∞)} g(x, ψ^∞) χ_X(x) χ_{L^∞}(x) dH^{d−1}(x)
 = ∫_{Γ_ij(ψ^∞) ∩ Ω̃} g(x, ψ^∞) χ_X(x) χ_{L^∞}(x) dH^{d−1}(x),

because Γ_ij(ψ^∞) ∩ X ⊂ Ω̃. We also have

H_ij(ψ^n) = ∫_{Γ_ij(ψ^n)} g(x, ψ^n) χ_X(x) χ_{Lag_ij(ψ^n)}(x) dH^{d−1}(x).

By the change of variable x ↦ F_n(x), the latter equality becomes

H_ij(ψ^n) = ∫_{Γ_ij(ψ^∞) ∩ Ω̃} g(F_n(x), ψ^n) JF_n(x) χ_X(F_n(x)) χ_{L^n}(x) dH^{d−1}(x).

We already have the pointwise convergences F_n(x) → x and JF_n(x) → 1 as n → ∞.
If we can show that

lim_{n→+∞} χ_X(F_n(x)) χ_{L^n}(x) = χ_X(x) χ_{L^∞}(x)

for H^{d−1}-almost every point x, then using Lebesgue's dominated convergence theorem, we will obtain that H_ij(ψ^n) → H_ij(ψ^∞).

We first show that lim_{n→+∞} χ_{L^n}(x) = χ_{L^∞}(x) H^{d−1}-almost everywhere on Γ_ij(ψ^∞) ∩ Ω̃. We first consider the superior limit: given x ∈ Γ_ij(ψ^∞) ∩ Ω̃, we prove that limsup_{n→∞} χ_{L^n}(x) ≤ χ_{L^∞}(x). The limsup is non-zero if and only if there exists a subsequence (σ(n))_{n∈N} such that ∀n ∈ N, x ∈ L^{σ(n)}. In this case we have F_{σ(n)}(x) ∈ F_{σ(n)}(L^{σ(n)}) = Lag_ij(ψ^{σ(n)}) ∩ Ω̃. This means that for any k ≠ i, j,

G(F_{σ(n)}(x), y_i, ψ_i^{σ(n)}) = G(F_{σ(n)}(x), y_j, ψ_j^{σ(n)}) ≥ G(F_{σ(n)}(x), y_k, ψ_k^{σ(n)}).

Since G is continuous, the previous inequality passes to the limit n → ∞, showing that x ∈ L^∞, and that limsup_{n→∞} χ_{L^n}(x) ≤ χ_{L^∞}(x).

We now want to show liminf_{n→∞} χ_{L^n}(x) ≥ χ_{L^∞}(x). If x ∉ L^∞ the result is straightforward. Let us consider the set

S_ij = ⋃_{k≠i,j} Γ_ijk(ψ^∞) ∪ (Γ_ij(ψ^∞) ∩ ∂X). (3.9)

By the genericity hypothesis (Definition 15) we have H^{d−1}(S_ij) = 0. If x ∈ L^∞ \ S_ij, by definition we get for every k ∉ {i, j} that x does not belong to Γ_jk(ψ^∞). This implies a strict inequality

G(x, y_i, ψ_i^∞) = G(x, y_j, ψ_j^∞) > G(x, y_k, ψ_k^∞).

Since F_n(x) converges to x and since ψ^n converges to ψ^∞, we get for n large enough

G(F_n(x), y_i, ψ_i^n) > G(F_n(x), y_k, ψ_k^n) and G(F_n(x), y_j, ψ_j^n) > G(F_n(x), y_k, ψ_k^n).

Moreover, since x ∈ Γ_ij(ψ^∞), F_n(x) ∈ Γ_ij(ψ^n). Combining the inequalities above, this shows that F_n(x) belongs to Lag_ij(ψ^n) ∩ Ω̃ = F_n(L^n), i.e. x ∈ L^n. This gives us

∀x ∉ S_ij, liminf_{n→∞} χ_{L^n}(x) ≥ χ_{L^∞}(x).

Consider x ∉ S_ij.
For such x, we already know that χ_{L^n}(x) → χ_{L^∞}(x) as n → +∞. Thus, if x does not belong to L^∞, we directly have

lim_{n→+∞} χ_{L^n}(x) χ_X(F_n(x)) = χ_{L^∞}(x) χ_X(x) = 0.

We may now assume that x belongs to L^∞ \ S_ij. By definition of S_ij, this implies that x ∉ ∂X. We can directly deduce that χ_X is continuous at x and that χ_X(F_n(x)) → χ_X(x) when n → +∞.

In conclusion we have that H_ij(ψ^n) → H_ij(ψ^∞), so that H_ij is continuous. □

3.2. Kernel and image of DH. The goal of this section is to prove Proposition 18, which gives properties of the differential of the mass function H. We consider the admissible set

S⁺ = {ψ ∈ R^N | ∀i ∈ ⟦1, N⟧, H_i(ψ) > 0}. (3.10)

Proposition 18.
In addition to the assumptions of Proposition 16, we assume that int(X) ∩ {ρ > 0} is path-connected, where int(X) is the interior of X. Then we have, for any ψ ∈ S⁺:
• the differential DH(ψ) has rank N − 1;
• the image of DH(ψ) is im(DH(ψ)) = 𝟙^⊥, where 𝟙 = (1, · · · , 1) ∈ R^N;
• for any w ∈ ker(DH(ψ)) \ {0}, we have w_i ≠ 0 for all i ∈ ⟦1, N⟧, and all the w_i have the same sign.

The next two lemmas have already been included in the recent survey on optimal transport involving the second and third authors [15], but we include them here for completeness. The proof of Proposition 18 is different from the previous work in optimal transport because DH is not symmetric.

Lemma 19.
Let U ⊂ R^d be a path-connected open set, and S ⊂ R^d be a closed set such that H^{d−1}(S) = 0. Then U \ S is path-connected.

Proof. It suffices to treat the case where U is an open ball; the general case follows by standard connectedness arguments. Let x, y ∈ U \ S be distinct points. Since U \ S is open, there exists r > 0 such that B(x, r) and B(y, r) are included in U \ S. Consider the hyperplane H orthogonal to the segment [x, y], and Π_H the projection on H. Then, since Π_H is 1-Lipschitz, H^{d−1}(Π_H S) ≤ H^{d−1}(S) = 0, so that H \ Π_H S is dense in the hyperplane H. In particular, there exists a point z ∈ Π_H(B(x, r)) \ Π_H S = Π_H(B(y, r)) \ Π_H S. By construction, the line z + R(y − x) avoids S and passes through the balls B(x, r) ⊂ U \ S and B(y, r) ⊂ U \ S. This shows that the points x, y can be connected in U \ S. □

We define for ψ ∈ R^N the graph G_ψ = (V, E) with vertex set V = {1, . . . , N} and edge set

E = {(i, j) ∈ V × V | ∂H_i/∂ψ_j (ψ) > 0}.

We have the following result.
Lemma 20.
Under the assumptions of Proposition 18 and for $\psi \in S^+$, the graph $G_\psi$ is connected.

Proof. Let $Z = \mathrm{int}(X) \cap \{\rho > 0\}$, and let $S = \bigcup_{i,j} S_{ij}$, where $S_{ij}$ is defined in (3.9). From Lemma 19 the set $Z \setminus S$ is path-connected, and we also have $\mu(Z \setminus S) = 1$ since $\mu(\partial X) = \mu(S) = 0$. Suppose that $G_\psi$ is not connected. Then there exists a connected component $I$ of $G_\psi$ with $I \neq \llbracket 1,N\rrbracket$. We consider the two non-empty sets
$$U_1 = \bigcup_{i \in I} \mathrm{Lag}_i(\psi) \cap (Z\setminus S) \quad\text{and}\quad U_2 = \bigcup_{i \notin I} \mathrm{Lag}_i(\psi) \cap (Z\setminus S),$$
which partition $Z \setminus S$ up to a Lebesgue-negligible set. Moreover, since $\psi \in S^+$,
$$U_1 \cup U_2 = Z\setminus S, \qquad 0 < \mu(U_1) < 1, \qquad 0 < \mu(U_2) < 1.$$
By construction, $U_1$ and $U_2$ are closed sets in $Z \setminus S$. Since $\mu(U_1) > 0$ and $\mu(U_2) > 0$, we can pick $x \in U_1$ and $y \in U_2$. The set $Z \setminus S$ being path-connected, there exists a path $\gamma \in C^0([0,1], Z\setminus S)$ satisfying $\gamma(0) = x$ and $\gamma(1) = y$. We let $t = \max\{s \in [0,1] \mid \gamma(s) \in U_1\}$ and we are going to show that $\gamma(t) \in U_1 \cap U_2$. By construction, $\gamma(t)$ belongs to $U_1$. Now if $t = 1$ we have $\gamma(t) = y \in U_2$. If not, we have $\gamma(t+\epsilon) \in U_2$ for all small enough $\epsilon > 0$. Since $U_2$ is relatively closed in $Z \setminus S$ and since $\gamma$ is continuous, we have $\gamma(t) \in U_2$. Setting $z = \gamma(t)$, there exist $i \in I$ and $j \notin I$ such that $z \in \mathrm{Lag}_i(\psi) \cap \mathrm{Lag}_j(\psi)$. Moreover, since $z \notin S$, we get that for any $k \notin \{i,j\}$,
$$G(z, y_i, \psi_i) = G(z, y_j, \psi_j) > G(z, y_k, \psi_k).$$
By continuity of $G$, we deduce that there exists an open ball of radius $r > 0$ such that
$$\forall x \in \mathrm{B}(z,r),\ \forall k \notin \{i,j\},\quad G(x, y_i, \psi_i) > G(x, y_k, \psi_k).$$
This implies that
$$\mathrm{B}(z,r) \cap \Gamma_{ij}(\psi) \subset \mathrm{Lag}_{ij}(\psi),$$
where $\Gamma_{ij}(\psi)$ is defined in Definition 15. By the (Twist) condition and the inverse function theorem, we know that $\Gamma_{ij}(\psi)$ is a $(d-1)$-dimensional manifold and $z \in \Gamma_{ij}(\psi)$. Moreover, we have $\rho(z) > 0$ because $z \in Z$ and $\rho$ is continuous on $Z \subset X$ by hypothesis.
We now have
$$\frac{\partial H_i}{\partial \psi_j}(\psi) = \int_{\mathrm{Lag}_{ij}(\psi)} \frac{\rho(x)\,|\partial_v G(x,y_i,\psi_i)|}{\|\nabla_x G(x,y_j,\psi_j) - \nabla_x G(x,y_i,\psi_i)\|}\, d\mathcal{H}^{d-1}(x) \ge \int_{\mathrm{B}(z,r)\cap\Gamma_{ij}(\psi)} \frac{\rho(x)\,|\partial_v G(x,y_i,\psi_i)|}{\|\nabla_x G(x,y_j,\psi_j) - \nabla_x G(x,y_i,\psi_i)\|}\, d\mathcal{H}^{d-1}(x) > 0,$$
which contradicts the hypothesis that $i$ and $j$ are not connected in the graph $G_\psi$. $\square$

Proof of Proposition 18.
We write $M = DH(\psi)$ for the matrix with coefficients $m_{i,j} = \partial H_i/\partial\psi_j(\psi)$. We first show that $\ker(M^T) = \mathrm{span}(\mathbf{1})$. The inclusion $\mathbf{1} \in \ker(M^T)$ follows from
$$\sum_{i=1}^N m_{i,j} = \frac{\partial}{\partial\psi_j}\Big(\sum_{i=1}^N H_i(\psi)\Big) = 0.$$
Consider now $v \in \ker(M^T)$, and pick an index $i_0$ where $v$ is maximal, i.e. $i_0 \in \arg\max_{1\le i\le N} v_i$. We have
$$0 = (M^T v)_{i_0} = \sum_{i=1}^N m_{i,i_0} v_i = \sum_{i\neq i_0} m_{i,i_0} v_i + m_{i_0,i_0} v_{i_0} = \sum_{i\neq i_0} m_{i,i_0}(v_i - v_{i_0}).$$
Since $\psi \in S^+$, we have by Proposition 16 that $m_{i,i_0} \ge 0$ for $i \neq i_0$. By definition of $i_0$ we also have $v_i - v_{i_0} \le 0$. From all this we deduce that $v_i = v_{i_0}$ for any $i \neq i_0$ satisfying $m_{i,i_0} > 0$, i.e. for any vertex $i$ adjacent to $i_0$ in the graph $G_\psi$. By connectedness of $G_\psi$, we conclude that $v = v_{i_0}\mathbf{1}$, thus showing $\ker(M^T) = \mathrm{span}(\mathbf{1})$. We deduce from this result that $M$ has rank $N-1$, because $\mathrm{rk}(M) = \mathrm{rk}(M^T) = N-1$. Moreover, for any $u \in \mathbb{R}^N$,
$$\langle \mathbf{1}, Mu\rangle = (M^T\mathbf{1})^T u = 0.$$
Since the spaces $\mathrm{im}(M)$ and $\mathbf{1}^\perp$ have the same dimension, we immediately get $\mathrm{im}(M) = \mathbf{1}^\perp$.
Let $w \in \ker(M)\setminus\{0\}$; we now want to show that $w_i \neq 0$ for all $i \in \llbracket 1,N\rrbracket$ and that all the $w_i$ have the same sign. The proof consists of two steps:
• Step 1: we show that $w \ge 0$ (or $-w \ge 0$).
• Step 2: we show that $w_i > 0$ for all $i \in \llbracket 1,N\rrbracket$.
We define $\lambda = \max_i |m_{i,i}|$ and $A = \lambda I + M$. With these definitions, $v$ belongs to $\ker(M)$ if and only if $Av = \lambda v$. Moreover, for any $i,j \in \llbracket 1,N\rrbracket$, one has $a_{i,j} \ge 0$ and $\sum_{k=1}^N a_{k,j} = \lambda$.

Step 1:
Assume that there exists $i_0 \in \llbracket 1,N\rrbracket$ such that $w_{i_0} \ge 0$ (we can do this without loss of generality, by working with $-w$ otherwise). Suppose that there exists $j \neq i_0$ such that $a_{i_0,j} > 0$ and $w_j < 0$. Then, since $Aw = \lambda w$, we have $\lambda w_{i_0} = \sum_{j=1}^N a_{i_0,j} w_j$ and thus $\lambda|w_{i_0}| < \sum_{j=1}^N a_{i_0,j}|w_j|$. We also have, for any $i \in \llbracket 1,N\rrbracket$, $\lambda|w_i| \le \sum_{j=1}^N a_{i,j}|w_j|$. By summing this inequality over $i$, and since the inequality is strict for $i = i_0$, we obtain
$$\sum_{i=1}^N \lambda|w_i| < \sum_{i=1}^N\sum_{j=1}^N a_{i,j}|w_j| = \sum_{j=1}^N |w_j| \sum_{i=1}^N a_{i,j} = \sum_{j=1}^N \lambda|w_j|,$$
which is a contradiction. We can therefore affirm that there exists no index $j \neq i_0$ such that $w_j < 0$ and $a_{i_0,j} > 0$. Since $A = M + \lambda I$, for $j \neq i_0$ we have $a_{i_0,j} = m_{i_0,j}$. We thus have
$$\forall j \in \llbracket 1,N\rrbracket,\quad m_{i_0,j} > 0 \implies w_j \ge 0.$$
By connectedness of $G_\psi$ we deduce $w \ge 0$.

Step 2:
If there exists $i \in \llbracket 1,N\rrbracket$ such that $w_i = 0$, then $\sum_j a_{i,j} w_j = 0$. Recall that by construction $a_{i,j} \ge 0$ and, by Step 1, $w_j \ge 0$, so we have
$$\forall j,\quad a_{i,j} > 0 \implies w_j = 0.$$
Again by connectedness of $G_\psi$ we get $w = 0$, a contradiction. $\square$

Remark 21. Part of the proof of Proposition 18 can also be seen as a consequence of the Perron–Frobenius theorem, using the notions of irreducible and stochastic matrices. The matrix $A = M + \lambda I$ can be written $A = \lambda S$, where $S^T$ is a stochastic matrix. The matrix $S$ thus has spectral radius $1$, and $A$ has spectral radius $\lambda$. Since $M$ is irreducible, $A$ is also irreducible. The Perron–Frobenius theorem then implies that $\lambda$ is a simple eigenvalue of $A$ with an associated eigenvector $w$ satisfying $w_i > 0$ for any $i \in \llbracket 1,N\rrbracket$. Since $Av = \lambda v \iff Mv = 0$, we deduce that $\mathrm{rk}(M) = N-1$ and $\ker(M) = \mathrm{span}(w)$. Moreover, since $\mathbf{1} \in \ker(M^T)$, we have for any $u \in \mathbb{R}^N$, $\langle \mathbf{1}, Mu\rangle = (M^T\mathbf{1})^T u = 0$ and $\mathrm{im}(M) = \mathbf{1}^\perp$.

3.3. Damped Newton algorithm.
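The linear-algebraic facts of Proposition 18 and Remark 21, on which the Newton direction below relies, can be illustrated numerically. The following is a sketch with a hypothetical $4\times 4$ matrix having the structure appearing in the proof (nonnegative off-diagonal entries, zero column sums, irreducible sparsity pattern); all numerical values are illustrative.

```python
import numpy as np

# Hypothetical matrix with the structure of M = DH(psi): nonnegative
# off-diagonal entries, zero column sums (so 1 spans ker(M^T)), and an
# irreducible (connected) sparsity pattern.
M = np.array([
    [-3.0,  1.0,  0.0,  2.0],
    [ 1.0, -2.0,  1.0,  0.0],
    [ 0.0,  1.0, -2.0,  1.0],
    [ 2.0,  0.0,  1.0, -3.0],
])
assert np.allclose(M.sum(axis=0), 0.0)       # columns sum to zero

lam = np.max(np.abs(np.diag(M)))
A = lam * np.eye(4) + M                      # nonnegative and irreducible

# The Perron eigenvector of A for the eigenvalue lam spans ker(M) and
# has entries of constant sign, as in Remark 21.
eigvals, eigvecs = np.linalg.eig(A)
k = int(np.argmin(np.abs(eigvals - lam)))
w = np.real(eigvecs[:, k])
w = w / w[int(np.argmax(np.abs(w)))]         # fix the overall sign
assert np.allclose(M @ w, 0.0, atol=1e-10)   # w lies in ker(M)
assert np.all(w > 0)                         # all entries share one sign
assert np.linalg.matrix_rank(M) == 3         # rank N - 1
```

For this particular (symmetric) example the kernel is spanned by $(1,1,1,1)$; the general statement only requires irreducibility, i.e. connectedness of the graph $G_\psi$.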
In this section, we present a damped Newton algorithm to solve the generated Jacobian equation (GenJacD), namely $H(\psi) = \nu$. For this purpose, we define in the following lemma an admissible set of variables that can be used in our algorithm.

Lemma 22 (Admissible set). Suppose that the hypotheses of Proposition 16 are satisfied. For any $\delta > 0$, there exists $\alpha \in \mathbb{R}$ such that the set
$$S_{\alpha,\delta} := \big\{ \psi \in \mathbb{R}^N \mid \psi_1 = \alpha \text{ and } \forall i \in \llbracket 1,N\rrbracket,\ H_i(\psi) \ge \delta \big\} \subset S^+ \tag{3.11}$$
is a compact subset of $\mathbb{R}^N$. Furthermore, for $\delta$ small enough, the set (3.11) is non-empty.

Proof. Let $\gamma \in \mathbb{R}$ and $M = \max_{(x,y)\in X\times Y} G(x,y,\gamma)$, where $M$ is finite thanks to the continuity of $G$ and the compactness of $X\times Y$. From the condition (UC), there exists $\alpha \in \mathbb{R}$ such that $\min_{x\in X} G(x,y_1,\alpha) > M$. If $\psi \in \mathbb{R}^N$ is such that $\psi_1 = \alpha$ and $\psi_i > \gamma$ for some $i \ge 2$, then using (Mono),
$$\forall x \in X,\quad G(x,y_1,\alpha) > M \ge G(x,y_i,\gamma) \ge G(x,y_i,\psi_i),$$
thus implying that $\mathrm{Lag}_i(\psi) = \emptyset$, and in particular $\psi \notin S_{\alpha,\delta}$. We argue similarly to show a lower bound on the elements of $S_{\alpha,\delta}$: by (UC), there exists $\beta \in \mathbb{R}$ such that $\min_{(x,y)\in X\times Y} G(x,y,\beta) > \max_{x\in X} G(x,y_1,\alpha)$. If $\psi \in \mathbb{R}^N$ is such that $\psi_1 = \alpha$ and $\psi_i < \beta$ for some $i \ge 2$, then using (Mono) we get
$$\forall x \in X,\quad G(x,y_i,\psi_i) \ge G(x,y_i,\beta) > G(x,y_1,\alpha),$$
thus showing that $\mathrm{Lag}_1(\psi) = \emptyset$, so that $\psi \notin S_{\alpha,\delta}$. The set $S_{\alpha,\delta}$ can be written as $S_{\alpha,\delta} = (\{\alpha\}\times\mathbb{R}^{N-1}) \cap H^{-1}([\delta,1]^N)$, and is therefore closed by continuity of $H$. The previous computations show that $S_{\alpha,\delta} \subseteq \{\alpha\}\times[\beta,\gamma]^{N-1}$, proving that $S_{\alpha,\delta}$ is compact.

Now suppose that $\delta \le 1/2^{N-1}$. We can then iteratively construct a vector $\psi \in S_{\alpha,\delta}$ in the following way. We start from $\psi = (\alpha,\gamma,\dots,\gamma) \in \mathbb{R}^N$. We then have $H_1(\psi) = 1$ and $H_i(\psi) = 0$ for any $i \ge 2$. Then, for all $i$ from $2$ to $N$, we decrease $\psi_i$ until $H_i(\psi) = 1/2^{i-1}$.
After iteration $i$ we then have
$$\forall k < i,\quad H_k(\psi) \ge \frac{1}{2^{k-1}} - \sum_{k+1\le j\le i} \frac{1}{2^{j-1}} = \frac{1}{2^{i-1}}.$$
After iteration $N$ we thus have $H_i(\psi) \ge 1/2^{N-1} \ge \delta$ for all $i \in \llbracket 1,N\rrbracket$, and since $\psi_1 = \alpha$ has not been changed during the process, we have $\psi \in S_{\alpha,\delta}$ and $S_{\alpha,\delta} \neq \emptyset$. $\square$

The differential of $H$ is not invertible, but we can still define a Newton direction by fixing one coordinate:

Proposition 23 (Newton direction). Under the assumptions of Proposition 18, the system
$$\begin{cases} DH(\psi)\,u = H(\psi) - \nu \\ u_1 = 0 \end{cases} \tag{3.12}$$
has a unique solution in $\mathbb{R}^N$.

Proof. Notice that from Proposition 18, $DH(\psi)$ has rank $N-1$, and since $H(\psi)-\nu \in \mathbf{1}^\perp = \mathrm{im}(DH(\psi))$, the set $S = \{u \in \mathbb{R}^N \mid DH(\psi)u = H(\psi)-\nu\}$ is of dimension 1. For $u_0 \in S$ and $w \in \ker(DH(\psi))\setminus\{0\}$, we have $S = \{u_0 + tw,\ t \in \mathbb{R}\}$. Since $w_1 \neq 0$ for $w \in \ker(DH(\psi))\setminus\{0\}$, the system (3.12) has a unique solution. $\square$

Theorem 24 (Linear convergence). Assume the following:
• the generating function $G \in C^1(\Omega\times Y\times\mathbb{R})$ satisfies the assumptions (Reg), (Mono), (Twist), (UC), (Gen$^Y_\Omega$), (Gen$^Y_{\partial X}$);
• $X \subseteq \Omega$ is compact and $\rho$ is a continuous probability density on $X$;
Algorithm 1
Damped Newton algorithm to solve (GenJacD)
Require: $\varepsilon > 0$; initialization $\psi^0 \in S_{\alpha,\delta}$ where $\delta \le \min_i \nu_i/2$
Ensure: $\psi$ such that $\|H(\psi) - \nu\| \le \varepsilon$
$k \leftarrow 0$
while $\|H(\psi^k) - \nu\| > \varepsilon$ do
  Define $u^k$ as the solution of the linear system
  $$\begin{cases} DH(\psi^k)\,u = H(\psi^k) - \nu \\ u_1 = 0 \end{cases}$$
  Compute $\tau^k$ by backtracking, i.e.
  $$\tau^k = \max\Big\{ \tau \in 2^{-\mathbb{N}} \ \Big|\ \psi^{k,\tau} := \psi^k - \tau u^k \in S_{\alpha,\delta} \text{ and } \|H(\psi^{k,\tau}) - \nu\| \le \Big(1-\frac{\tau}{2}\Big)\|H(\psi^k) - \nu\| \Big\}$$
  $\psi^{k+1} \leftarrow \psi^k - \tau^k u^k$ and $k \leftarrow k+1$
return $\psi^k$

• $\mathrm{int}(X) \cap \{\rho > 0\}$ is path-connected.
Then, there exists $\tau^* \in\ ]0,1]$ such that the iterates of Algorithm 1 satisfy
$$\|H(\psi^k) - \nu\| \le \Big(1 - \frac{\tau^*}{2}\Big)^k \|H(\psi^0) - \nu\|.$$
In particular, Algorithm 1 terminates.

Proof.
Let $\psi^0 \in S_{\alpha,\delta}$; we define the set
$$K_\delta = \{\psi \in S_{\alpha,\delta} \mid \|H(\psi) - \nu\| \le \|H(\psi^0) - \nu\|\}.$$
Since the function $H$ is continuous, the set $K_\delta$ is non-empty and compact. Note that the system (3.12) has $N+1$ equations for $N$ unknowns, and we know that the last equation $u_1 = 0$, which can be written $e_1^T u = 0$, is linearly independent from the others. We can thus rewrite the system in the form
$$M(\psi)\,u = H(\psi) - \nu \tag{3.13}$$
where $M(\psi) = DH(\psi) + e_1 e_1^T$. Obviously, if $u$ is a solution of (3.12), then it is also a solution of (3.13). Now if $u$ is a solution of (3.13), since $e_1 \notin \mathrm{im}(DH(\psi))$ and $H(\psi)-\nu \in \mathrm{im}(DH(\psi))$, we have $e_1 e_1^T u = (e_1^T u)\,e_1 = 0$, which means that the scalar $e_1^T u = 0$ and thus $u$ satisfies (3.12). Since (3.12) has a unique solution, $M(\psi)$ is thus invertible. Let $u_\psi$ be the solution of (3.13) for a given $\psi$. We have $u_\psi = M^{-1}(\psi)(H(\psi)-\nu)$, and thus for any $\psi \in K_\delta$, $\|u_\psi\| \le \|M^{-1}(\psi)\|_{\mathrm{op}}\|H(\psi)-\nu\|$, where $\|\cdot\|_{\mathrm{op}}$ denotes the operator norm on $M_N(\mathbb{R})$. The function $\psi \mapsto M(\psi)$ is continuous and $M(\psi)$ is invertible, so $\psi \mapsto \|M^{-1}(\psi)\|_{\mathrm{op}}$ is also continuous and admits a maximum on the compact set $K_\delta$. Setting $C = \max_{\psi\in K_\delta} \|M^{-1}(\psi)\|_{\mathrm{op}}$, we have for any $\psi \in K_\delta$, $\|u_\psi\| \le C\|H(\psi)-\nu\|$. Let $\psi \in K_\delta$ and $\psi^\tau = \psi - \tau u_\psi$ for $\tau \in [0,1]$. The first coordinate of $\psi^\tau$ satisfies $\psi^\tau_1 = \alpha$.
For small $\tau$ we can write the Taylor expansion
$$H(\psi^\tau) = H(\psi) - \tau\, DH(\psi)\, u_\psi + o(\tau\|u_\psi\|) = H(\psi) - \tau(H(\psi)-\nu) + o(\tau\|H(\psi)-\nu\|).$$
It follows that
$$\|H(\psi^\tau) - \nu\| = (1-\tau)\|H(\psi)-\nu\| + o(\tau\|H(\psi)-\nu\|),$$
and thus there exists $\tau_\psi > 0$ such that for all $\tau \in\ ]0,\tau_\psi[$,
$$\|H(\psi^\tau) - \nu\| \le \Big(1 - \frac{\tau}{2}\Big)\|H(\psi)-\nu\|.$$
By compactness of $K_\delta$, this property holds on a uniform open range $]0,\tau_1[$. Moreover, coordinatewise we have for $i \in \llbracket 1,N\rrbracket$,
$$H_i(\psi^\tau) = (1-\tau)H_i(\psi) + \tau\nu_i + o(\tau\|H(\psi)-\nu\|),$$
and since $\nu_i \ge 2\delta$ and $H_i(\psi) \ge \delta$, there exists $\tau'_\psi > 0$ such that
$$\forall \tau \in\ ]0,\tau'_\psi[,\ \forall i \in \llbracket 1,N\rrbracket,\quad H_i(\psi^\tau) \ge \delta.$$
Then, again by compactness of $K_\delta$, there exists $\tau_2 > 0$ such that for all $\psi \in K_\delta$ and $\tau \in\ ]0,\tau_2[$, $\psi^\tau \in S_{\alpha,\delta}$. This implies that the chosen $\tau^k$ in the algorithm is always larger than $\tau^* = \frac{1}{2}\min(\tau_1,\tau_2)$. By definition of the iterates, we deduce at once that
$$\|H(\psi^{k+1}) - \nu\| \le \Big(1 - \frac{\tau^*}{2}\Big)\|H(\psi^k) - \nu\|,$$
thus proving the desired convergence result. $\square$

Remark
25 (Existence). Note that the convergence of the algorithm allows one to recover the existence of a solution to the semi-discrete generated Jacobian equation. To obtain this result, we need the set $S_{\alpha,\delta}$ to be non-empty, which is the case by Lemma 22 if $\delta$ is small enough.

4. Application to the Near Field Parallel Reflector problem
The Near Field Parallel Reflector problem is a non-imaging optics problem that cannot be recast as an optimal transport problem [12, 18], but that can be written as a generated Jacobian equation [9, 1]. We show in this section that the damped Newton algorithm can be applied to solve this problem. The description of the problem is as follows. We have a collimated light source (i.e. all the rays are parallel and vertical) emitted from a horizontal plane $X \subset \mathbb{R}^2\times\{0\}$, whose intensity is modeled by a probability measure $\mu$ on $X$. We also have a target light, which is modeled by a probability measure $\nu$ supported on a finite set $Y \subset \mathbb{R}^2\times\{0\}$. The Near Field Parallel Reflector problem consists in finding the surface $\Sigma$ of a mirror that reflects the measure $\mu$ onto the measure $\nu$. Let us denote by $T_\Sigma : X \to Y$ the map that associates to any incident ray emanating from $x \in X$ the reflected direction $T_\Sigma(x)$ given by Snell's law of reflection. The Near Field Parallel Reflector problem then amounts to finding the mirror surface $\Sigma$ such that for any point $y \in Y$,
$$\mu(T_\Sigma^{-1}(y)) = \nu(y). \tag{NF paral}$$

Figure 1.
A reflector composed of three paraboloids reflecting upward vertical rays toward the points $y_1$, $y_2$, $y_3$.

4.1. Generated Jacobian equation.
In order to handle this problem, it is natural to consider paraboloid surfaces. Indeed, as illustrated in Figure 1, a paraboloid $P(y, 1/\psi(y))$ with focal point $y \in \mathbb{R}^2$, focal distance $1/\psi(y)$ and direction $(0,0,-1)$ reflects every upward vertical ray toward the focal point $y$. Since the target is finite, we choose to define the reflector surface $\Sigma$ as the upper envelope of a finite family of paraboloids $P(y, 1/\psi(y))$. Every paraboloid $P(y, 1/\psi(y))$ being the graph of the function $x \mapsto \frac{1}{2\psi(y)} - \frac{\psi(y)}{2}\|x-y\|^2$ over $X$, the surface $\Sigma$ is the graph of the function
$$u(x) = \max_{y\in Y}\ \frac{1}{2\psi(y)} - \frac{\psi(y)}{2}\|x-y\|^2.$$
We define $G : \Omega\times Y\times\mathbb{R}^*_+ \to \mathbb{R}$ by
$$G(x,y,v) = \frac{1}{2v} - \frac{v}{2}\|x-y\|^2 \tag{4.14}$$
where $\Omega$ is a bounded open set containing $X$. Then, for every $y \in Y$, one has $T_\Sigma^{-1}(y) = \mathrm{Lag}_y(\psi)$. In order to show that the semi-discrete version of the Near Field problem (NF paral) can be solved using our algorithm, we need to show that the generating function $G$ satisfies all the hypotheses of Definition 1. The conditions (Reg), (Mono) and (UC) are easy to verify, as mentioned in [1]. This follows from the fact that $(x,y,v)\mapsto G(x,y,v)$ is continuously differentiable in $x$ and $v$, that $\nabla_x G(x,y,v) = v(y-x)$ and that $\partial_v G(x,y,v) = -1/(2v^2) - \|x-y\|^2/2$. The (UC) condition is satisfied because $\Omega$ is bounded. Concerning the (Twist) assumption, F. Abedin and C. Gutiérrez [1] introduce a condition that they call the Visibility condition: for any two points $y_i, y_j \in Y$, the line containing these two points does not intersect $X$. Since $X$ and $Y$ lie in the same plane $\mathbb{R}^2\times\{0\}$, this condition is quite restrictive in practice. We show below that it is not necessary here, since it is sufficient to have the (Twist) condition on some interval $]0,\gamma[$ with $\gamma \in \mathbb{R}^*_+$.

Proposition 26.
The function $G$ satisfies the (Twist) condition on $X\times Y\times\ ]0,\gamma]$, where $\gamma$ satisfies
$$\gamma < \inf_{(x,y)\in X\times Y} \frac{1}{\|x-y\|}.$$

Proof.
Let $x \in X$, and suppose that $G(x,y_1,v_1) = G(x,y_2,v_2)$ and that $\nabla_x G(x,y_1,v_1) = \nabla_x G(x,y_2,v_2)$, with $v_i \in\ ]0,\gamma]$ and $y_i \in Y$. The second condition implies that $v_1(y_1-x) = v_2(y_2-x)$, which implies that $x$, $y_1$ and $y_2$ are collinear, with $y_2 - x = (v_1/v_2)(y_1-x)$ and in particular $v_2\|x-y_2\| = v_1\|x-y_1\|$. Plugging this into the relation $G(x,y_1,v_1) = G(x,y_2,v_2)$ gives
$$\frac{1}{2v_1}\big(1 - v_1^2\|x-y_1\|^2\big) = \frac{1}{2v_2}\big(1 - v_1^2\|x-y_1\|^2\big),$$
thus we have either $(y_1,v_1) = (y_2,v_2)$ or $v_1 = 1/\|x-y_1\|$. The latter would imply $v_1 > \gamma$ by the choice of $\gamma$, which is not possible since by assumption $v_1 \le \gamma$. It follows that $(y,v) \mapsto (G(x,y,v), \nabla_x G(x,y,v))$ is injective on $Y\times\ ]0,\gamma]$ for any $x \in X$. $\square$

4.2. Laguerre and Möbius diagrams.
In order to solve the generated Jacobian equation (NF paral) with the damped Newton algorithm, we study the Laguerre diagram induced by the generating function $G$. We observe that it is a particular instance of a Möbius diagram [3]. This will be useful to get a geometric condition that implies genericity (necessary to apply Algorithm 1), and it will also be used for the numerical computation of the Laguerre diagram.

Definition 27 (Möbius diagram). The Möbius diagram of a family $\omega = (\omega_i)_{1\le i\le N}$ of $N$ triplets $\omega_i = (\lambda_i,\mu_i,p_i) \in \mathbb{R}\times\mathbb{R}\times\mathbb{R}^d$ is the decomposition of the space into Möbius cells $M_i(\omega)$ defined by
$$M_i(\omega) = \big\{ x \in \mathbb{R}^d \mid \forall j \in \llbracket 1,N\rrbracket,\ \lambda_i\|x-p_i\|^2 - \mu_i \le \lambda_j\|x-p_j\|^2 - \mu_j \big\}.$$
A simple calculation shows that the boundary of Möbius cells is composed of arcs of (possibly degenerate) circles [3].
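Definition 27 translates directly into code. The following sketch, with random illustrative data, checks the identification used in this section: with $\lambda_i = \psi_i/2$, $\mu_i = 1/(2\psi_i)$ and $p_i = y_i$, the Möbius cell containing a point coincides with its Laguerre cell for the generating function (4.14).

```python
import numpy as np

def mobius_cell(x, lam, mu, P):
    """Index of the Moebius cell containing x (Definition 27)."""
    scores = lam * np.sum((x - P) ** 2, axis=1) - mu
    return int(np.argmin(scores))

def laguerre_cell(x, psi, Y):
    """Index of the Laguerre cell of G(x,y,v) = 1/(2v) - v|x-y|^2/2."""
    scores = 1.0 / (2.0 * psi) - 0.5 * psi * np.sum((x - Y) ** 2, axis=1)
    return int(np.argmax(scores))

rng = np.random.default_rng(0)
Y = rng.random((5, 2))            # illustrative target points
psi = 0.5 + rng.random(5)         # illustrative positive weights
lam, mu = psi / 2.0, 1.0 / (2.0 * psi)
# The two decompositions agree, since the Moebius score is minus G.
for x in rng.random((100, 2)):
    assert mobius_cell(x, lam, mu, Y) == laguerre_cell(x, psi, Y)
```

The agreement holds because $\lambda_i\|x-p_i\|^2 - \mu_i = -G(x,y_i,\psi_i)$, so minimizing the Möbius score is the same as maximizing $G$.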
Proposition 28.
For any $p_i \neq p_j$, the intersection $M_i(\omega)\cap M_j(\omega)$ between two Möbius cells is either empty, or an arc of a circle whose center belongs to the line passing through $p_i$ and $p_j$, or an arc of the bisector of $p_i$ and $p_j$.

Note that if we define $\lambda_i = \psi_i/2$, $\mu_i = 1/(2\psi_i)$ and $p_i = y_i$, then the Laguerre cells are Möbius cells, namely
$$\mathrm{Lag}_i(\psi) = M_i(\omega)\cap\Omega.$$
This allows us to show that the conditions (Gen$^Y_\Omega$) and (Gen$^Y_{\partial X}$) that are required to show the convergence of Algorithm 1 are not restrictive. Indeed, by the previous proposition, the interface $\Gamma_{ij}(\psi)$ between the two Laguerre cells associated to $y_i$ and $y_j$ is contained in a circle whose center is on the line passing through $y_i$ and $y_j$. This circle can degenerate into a line, in which case it is the bisector between $y_i$ and $y_j$. Suppose that $Y$ does not contain three collinear points; then for any distinct $i,j,k$, $\Gamma_{ijk}(\psi)$ is the intersection of two circles with different centers, and (Gen$^Y_\Omega$) is satisfied. Similarly, if $\partial X$ does not contain any circle arc, nor a piece of the bisector of any two points of $Y$, then (Gen$^Y_{\partial X}$) is also satisfied. This allows us to prove the following theorem.
Theorem 29.
Suppose that $Y$ does not contain three collinear points, and that $\partial X$ does not contain any circle arc, nor a piece of the bisector of any two points of $Y$. Assume that the measures $\mu$ and $\nu$ satisfy the mass balance $\mu(X) = \nu(Y)$ and that $\mu$ is absolutely continuous with a continuous density $\rho$ such that $\mathrm{int}(X)\cap\{\rho > 0\}$ is path-connected. Then the damped Newton algorithm (Algorithm 1) converges toward a solution of (NF paral).

Remark 30. The generating function is defined on $\Omega\times Y\times\ ]0,\gamma[$ instead of $\Omega\times Y\times\mathbb{R}$. As mentioned in Remark 2, if $\zeta : \mathbb{R} \to\ ]0,\gamma[$ is a $C^1$-diffeomorphism, then the function $\widetilde{G}$ defined by $\widetilde{G}(x,y,v) = G(x,y,\zeta(v))$ is a generating function defined on $\Omega\times Y\times\mathbb{R}$, and we can apply Algorithm 1 to $\widetilde{G}$.

4.3. Implementation.
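At the top level, the damped Newton iteration of Algorithm 1 can be sketched as follows. This is a minimal sketch: `H`, `DH` and `in_admissible` are hypothetical callables standing for the mass function, its Jacobian and the membership test for $S_{\alpha,\delta}$, and the Newton direction is computed through the pinned system $M(\psi) = DH(\psi) + e_1 e_1^T$ of (3.13).

```python
import numpy as np

def damped_newton(H, DH, nu, psi0, in_admissible, eps=1e-8, max_iter=100):
    """Damped Newton iteration (sketch of Algorithm 1).

    H(psi) returns the mass vector, DH(psi) its Jacobian, and
    in_admissible(psi) tests membership in S_{alpha,delta}; all three
    are hypothetical callables provided by the caller.
    """
    psi = psi0.copy()
    N = len(psi)
    e1 = np.zeros(N)
    e1[0] = 1.0
    for _ in range(max_iter):
        r = H(psi) - nu
        if np.linalg.norm(r) <= eps:
            break
        # Newton direction: DH(psi) u = r with the pinning u_1 = 0,
        # obtained by solving with M = DH + e1 e1^T (cf. (3.13)).
        u = np.linalg.solve(DH(psi) + np.outer(e1, e1), r)
        # Backtracking: largest tau in {1, 1/2, 1/4, ...} keeping the
        # iterate admissible and decreasing the residual by 1 - tau/2.
        tau = 1.0
        while tau > 1e-12:
            cand = psi - tau * u
            if in_admissible(cand) and \
               np.linalg.norm(H(cand) - nu) <= (1.0 - tau / 2.0) * np.linalg.norm(r):
                psi = cand
                break
            tau /= 2.0
        else:
            raise RuntimeError("backtracking line search failed")
    return psi
```

The pinning keeps the first coordinate of the iterates fixed, matching the constraint $\psi_1 = \alpha$ defining $S_{\alpha,\delta}$.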
The main difficulty in the implementation of the Newton algorithm is the evaluation of the function $H$ and of its differential $DH$, which requires an accurate computation of the Laguerre diagram. For this, we use the fact that a Möbius diagram can be obtained by intersecting a 3D power diagram with a paraboloid [3].

Definition 31 (Power diagram). The power diagram of a set of $N$ weighted points $P = ((p_i,r_i))_{1\le i\le N}$, where $p_i \in \mathbb{R}^d$ and $r_i \in \mathbb{R}$, is the decomposition of the space into power cells given by
$$\mathrm{Pow}_i(P) = \big\{ x \in \mathbb{R}^d \mid \forall j \in \llbracket 1,N\rrbracket :\ \|x-p_i\|^2 - r_i \le \|x-p_j\|^2 - r_j \big\}.$$

Proposition 32.
The Laguerre cells associated to the generating function $G$ defined in (4.14) are given for any $i$ by
$$\mathrm{Lag}_i(\psi) = \Pi(\mathrm{Pow}_i(P)\cap\mathcal{P})\cap\Omega,$$
where $\mathcal{P}$ is the paraboloid in $\mathbb{R}^3$ parametrized by $x_3 = x_1^2 + x_2^2$, $\Pi$ is the projection of $\mathbb{R}^3$ onto $\mathbb{R}^2$ defined by $\Pi(x,y,z) = (x,y)$, and $(\mathrm{Pow}_i(P))_{1\le i\le N}$ is the power diagram associated to the weighted points $P$ given by
$$\forall i \in \llbracket 1,N\rrbracket:\quad p_i = \Big(\frac{\psi_i y_i}{2},\ -\frac{\psi_i}{4}\Big), \qquad r_i = \frac{\psi_i^2}{16} + \frac{\psi_i^2}{4}\|y_i\|^2 - \frac{\psi_i}{2}\|y_i\|^2 + \frac{1}{2\psi_i}. \tag{4.15}$$
In our implementation of the algorithm, the intersection of power diagrams with a paraboloid is computed using the algorithm presented in [13]. Once the diagram is computed, the function $H$ and its differential $DH$ are computed using the trapezoidal rule. Numerical experiments are performed with $X = [-1,1]^2$ and $\mu$ equal to (one fourth of) the restriction of the Lebesgue measure to $X$. The set $Y$ is randomly generated in the square $[0,1]^2$ for different values of $N$, associated with a discrete uniform measure $\nu$. Figure 2 (left) shows the initial diagram $(\mathrm{Lag}_i(\psi))_{1\le i\le N}$ with $N = 5000$ for some vector $\psi = \lambda\mathbf{1}$ with $\lambda > 0$. Figure 2 (right) is the same diagram after convergence of the algorithm, where $\psi$ is an approximate solution of (NF paral). The graph of Figure 3 represents the error $\|H(\psi^k)-\nu\|$ as a function of the iteration $k$. It shows the superlinear convergence of the damped Newton method.

Acknowledgements.
We acknowledge the support of the French Agence Nationale de la Recherche through the project MAGA (ANR-16-CE40-0014).
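As a numerical sanity check of Proposition 32, the following sketch verifies on random illustrative data that the power-diagram ordering of the lifted weighted points (4.15), restricted to the paraboloid $z = \|x\|^2$, coincides with the Laguerre ordering induced by the generating function (4.14).

```python
import numpy as np

rng = np.random.default_rng(1)
Y = rng.random((6, 2))            # illustrative target points
psi = 0.5 + rng.random(6)         # illustrative focal parameters

# Lifted weighted points of (4.15).
p = np.hstack([psi[:, None] * Y / 2.0, -psi[:, None] / 4.0])
r = psi**2 / 16.0 + (psi**2 / 4.0 - psi / 2.0) * np.sum(Y**2, axis=1) \
    + 1.0 / (2.0 * psi)

def power_cell_on_paraboloid(x):
    """Index of the power cell containing the lift of x."""
    xz = np.append(x, np.sum(x**2))   # lift x to the paraboloid z = |x|^2
    return int(np.argmin(np.sum((xz - p)**2, axis=1) - r))

def laguerre_cell(x):
    """Index maximizing G(x, y_i, psi_i) = 1/(2 psi_i) - psi_i |x-y_i|^2/2."""
    return int(np.argmax(1.0/(2.0*psi) - 0.5*psi*np.sum((x - Y)**2, axis=1)))

for x in rng.random((200, 2)):
    assert power_cell_on_paraboloid(x) == laguerre_cell(x)
```

The check works because, on the paraboloid, the power score of the lifted point $p_i$ differs from $-G(x,y_i,\psi_i)$ by a term common to all $i$.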
Figure 2.
Initial diagram for $N = 5000$ (left), and final diagram after convergence of the algorithm (right). Figure 3.
Numerical error $\|H(\psi^k) - \nu\|$ as a function of the iteration $k$.

References

[1] Farhan Abedin and Cristian E. Gutiérrez. An iterative method for generated Jacobian equations. Calculus of Variations and Partial Differential Equations, 56(4):101, 2017.
[2] Robert J. Berman. Convergence rates for discretized Monge–Ampère equations and quantitative stability of optimal transport. Foundations of Computational Mathematics, pages 1–42, 2020.
[3] Jean-Daniel Boissonnat, Camille Wormser, and Mariette Yvinec. Curved Voronoi diagrams. Effective Computational Geometry for Curves and Surfaces, 01 2007.
[4] Luis A. Caffarelli, Sergey A. Kochengin, and Vladimir I. Oliker. Problem of reflector design with given far-field scattering data. In Monge Ampère equation: applications to geometry and optimization, volume 226, pages 13–32, 1999.
[5] Alfred Galichon, Scott Duke Kominers, and Simon Weber. Costly concessions: An empirical framework for matching with imperfectly transferable utility. Journal of Political Economy, 127(6):2875–2925, 2019.
[6] Nestor Guillen. A primer on generated Jacobian equations: Geometry, optics, economics. Notices of the American Mathematical Society, 66:1, 10 2019.
[7] Nestor Guillen and Jun Kitagawa. Pointwise estimates and regularity in geometric optics and other generated Jacobian equations. Communications on Pure and Applied Mathematics, 70(6):1146–1220, 2017.
[8] Cristian E. Gutiérrez and Federico Tournier. Regularity for the near field parallel refractor and reflector problems. Calculus of Variations and Partial Differential Equations, 54(1):917–949, 2015.
[9] Feida Jiang and Neil S. Trudinger. On Pogorelov estimates in optimal transportation and geometric optics. Bulletin of Mathematical Sciences, 4(3):407–431, 2014.
[10] Jun Kitagawa. An iterative scheme for solving the optimal transportation problem. Calculus of Variations and Partial Differential Equations, 51(1-2):243–263, 2014.
[11] Jun Kitagawa, Quentin Mérigot, and Boris Thibert. Convergence of a Newton algorithm for semi-discrete optimal transport. Journal of the European Mathematical Society, 21(9):2603–2651, 2019.
[12] Sergey A. Kochengin and Vladimir I. Oliker. Determination of reflector surfaces from near-field scattering data. Inverse Problems, 13(2):363, 1997.
[13] Pedro Machado Manhães De Castro, Quentin Mérigot, and Boris Thibert. Far-field reflector problem and intersection of paraboloids. Numerische Mathematik, 134(2):389–411, 2016.
[14] Quentin Mérigot, Jocelyn Meyron, and Boris Thibert. An algorithm for optimal transport between a simplex soup and a point cloud. SIAM Journal on Imaging Sciences, 11(2):1363–1389, 2018.
[15] Quentin Mérigot and Boris Thibert. Optimal transport: discretization and algorithms. In Handbook of Numerical Analysis, volume 22. Elsevier, to appear, 2021.
[16] Georg Nöldeke and Larry Samuelson. The implementation duality. Econometrica, 86(4):1283–1324, 2018.
[17] V. I. Oliker and L. D. Prussner. On the numerical solution of the equation $\frac{\partial^2 z}{\partial x^2}\frac{\partial^2 z}{\partial y^2} - \big(\frac{\partial^2 z}{\partial x\,\partial y}\big)^2 = f$ and its discretizations, I. Numerische Mathematik, 54(3):271–293, 1989.
[18] Vladimir Oliker. Mathematical aspects of design of beam shaping surfaces in geometrical optics. In Trends in Nonlinear Analysis, pages 193–224. Springer, 2003.
[19] Neil S. Trudinger. On the local theory of prescribed Jacobian equations. Discrete & Continuous Dynamical Systems-A, 34(4):1663–1681, 2014.
Univ. Grenoble Alpes, CNRS, Grenoble INP, LJK, 38000 Grenoble, France
Email address : [email protected] Université Paris-Saclay, CNRS, Laboratoire de mathématiques d’Orsay,91405, Orsay, France and Institut universitaire de France (IUF)
Email address : [email protected] Univ. Grenoble Alpes, CNRS, Grenoble INP, LJK, 38000 Grenoble, France
Email address :